Tensor Analysis
SECOND EDITION


Theory and Applications to Geometry and 


Mechanics of Continua 


I. S. Sokolnikoff 


about the book... 


The second edition of this popular 
text continues to provide a clear and 
explicit introduction to the applications
of mathematics. Because of the
importance of linear transformations
in motivating the development of 
tensor theory, the author begins the 
book with a discussion of linear 
transformations and matrices. He 
then proceeds with a self-contained 
presentation of algebra and calculus 
of tensors. 

Dr. Sokolnikoff has expanded certain
sections on the uses of calculus
of variations in geometry in a gen- 
eral discussion of those geometrical 
topics that are important in the study 
of analytical dynamics and mechan- 
ics of continuous media. The essen- 
tial concepts of analytical mechanics 
are presented in a somewhat ex- 
panded but still concise manner. 
Relativistic mechanics is introduced 
and illuminated by a number of illus- 
trative examples. The concluding 
chapter, devoted to mechanics of 
continua, has been rewritten to pre- 
sent the essentials of the non-linear 
theory of mechanics of deformable 
media from a unified point of view. 

The Second Edition of TENSOR 
ANALYSIS provides a careful and 
broad introduction to the develop- 
ment of tensor theory and its appli- 
cations to geometry, mechanics, 
relativity and mechanics of continu- 
ous media. 



APPLIED MATHEMATICS SERIES 
Edited by 


I. S. SOKOLNIKOFF 




The Applied Mathematics Series is devoted to books 
dealing with mathematical theories underlying physical 
and biological sciences, and with advanced mathematical 
techniques needed for solving problems of these sciences. 


TENSOR ANALYSIS 


THEORY AND APPLICATIONS TO GEOMETRY 


AND MECHANICS OF CONTINUA 


Second Edition 


I. S. SOKOLNIKOFF 


PROFESSOR OF MATHEMATICS 
UNIVERSITY OF CALIFORNIA 
LOS ANGELES 


JOHN WILEY & SONS, INC. 
NEW YORK - LONDON - SYDNEY 


COPYRIGHT, 1951 © 1964 
BY 
JOHN WILEY & SONS, INC.


All Rights Reserved. 


This book or any part thereof must not 
be reproduced in any form without the 
written permission of the publisher. 


SECOND PRINTING, DECEMBER, 1965 


Library of Congress Catalog Card Number: 64-13223


PRINTED IN THE UNITED STATES OF AMERICA 


PREFACE TO THE SECOND EDITION 


In preparing the Second Edition of this book I have been guided by 
suggestions kindly made to me by users of the First Edition. There 
appeared to be no compelling reasons for making major changes in the 
introductory chapter concerned with linear transformations and matrices, 
or in the second chapter, devoted to algebra and calculus of tensors. 

In Chapter 3 some sections concerned with the uses of calculus of 
variations in geometry have been expanded, some new illustrative material 
introduced, and two new sections, on parallel surfaces and the Gauss- 
Bonnet theorem, have been added. Chapters 2 and 3 in the present 
edition contain adequate material for an introductory course on metric 
differential geometry at the beginning graduate level or, for that matter, 
at the upper-division undergraduate level. 

Chapter 4, dealing with analytical mechanics, has been expanded. It 
contains a distillation of the essentials of classical analytical mechanics 
and potential theory, which, together with Chapter 5 on relativistic me- 
chanics, should be, but often is not, a part of the equipment of every 
student of mathematics. A number of illustrative examples that further 
illuminate the theory have been introduced, and the discussion of non- 
holonomic dynamical systems, of Hamilton’s canonical equations, and 
of potential theory has been made more detailed. 

The concluding chapter, devoted to mechanics of continua, was entirely 
rewritten. It presents from a unified point of view and, it is hoped, with 
sufficient clarity, the essentials of the nonlinear theory of mechanics of 
deformable media. This chapter provides a common basis for a careful 
development of the mathematical theories of elasticity, plasticity, hydro- 
dynamics, and gas dynamics. 

I. S. SOKOLNIKOFF


Pacific Palisades, California 
January 1964 


PREFACE TO THE FIRST EDITION 


This book is an outgrowth of a course of lectures I gave over a period of 
years at the University of Wisconsin, Brown University, and the Univer- 
sity of California. My audience consisted, for the most part, of graduate 
students interested in applications of mathematics, and this fact shaped 
both the content and the character of exposition. 

Because of the importance of linear transformations in motivating the 
development of tensor theory, the first chapter in this book is given to a 
discussion of linear transformations and matrices, in which stress is placed 
on the geometry and physics of the situation. Although a large part of 
the subject matter treated in this chapter is normally covered in courses 
on matrix algebra, only a few of my listeners have had the sort of appre- 
ciation of matrix transformations that an applied mathematician should 
have. 

The second chapter is concerned with algebra and calculus of tensors. 
The treatment in it is self-contained and is not made to depend on some 
special field of mathematics as a vehicle for the development of tensor 
analysis. This is a departure from the customary practice of making 
geometry or relativity a medium for the unfolding of tensor analysis. 
Although this latter practice has a great deal to commend it because it 
provides a simple means for motivating the study of tensors, it often 
leaves an erroneous impression that the formulation of tensor analysis 
depends somehow on geometry or relativity. 

The remaining four chapters in this volume deal with the applica- 
tions of tensor calculus to geometry, analytical mechanics, relativistic 
mechanics, and mechanics of deformable media. Thus, Chapter 3 con- 
tains a selection of those geometrical topics that are important in the 
study of analytical dynamics and in such portions of elasticity and plas- 
ticity as deal with the deformation of plates and shells. This chapter 
provides a substantial introduction to the subject of metric differential 
geometry. In Chapter 4, the essential concepts of analytical mechanics 
are presented adequately and concisely. An introduction to relativistic 
mechanics is contained in Chapter 5. The treatment there was inten- 
tionally made very brief because some excellent books on relativity have 
appeared recently and there seems little point in duplicating their contents. 



The final chapter of the book is concerned with a formulation of the 
essential ideas of nonlinear mechanics of continuous media in the most 
general tensor form. The classical linearized equations of elasticity and 
fluid mechanics appear as special cases of the general treatment. 

Perhaps the best evidence of the remarkable effectiveness of the tensor 
apparatus in the study of Nature is in the fact that it was possible to 
include, between the covers of one small volume, a large amount of 
material that is of interest to mathematicians, physicists, and engineers. 

A survey of applied mathematics as broad as that in this book must 
inevitably reflect contributions of so many scholars that it is futile to 
attempt to assign proper credit for original ideas or methods of attack. 
However, in the treatment of geometry, the influence of T. Levi-Civita 
and A. J. McConnell, whose books (especially McConnell’s Applications 
of the Absolute Differential Calculus) I used in my classes for many 
years as required reading, is clearly discernible. Specific acknowledg- 
ments to these and other authors are made in the appropriate places in 
the text. However, my greatest debt is to my listeners, who have made 
the job of writing this book seem both enjoyable and worth while. 

It is a particular pleasure to single out among my listeners Mr. William 
R. Seugling, Research Assistant at the University of California at Los 
Angeles, who gave unstintingly of his time in following this book through 
press. 


I. S. SOKOLNIKOFF 
Los Angeles, California 
November 1951 


CONTENTS


1 LINEAR VECTOR SPACES. MATRICES

1. Coordinate Systems
2. The Geometric Concept of a Vector
3. Linear Vector Spaces. Dimensionality of Space
4. N-Dimensional Spaces
5. Linear Vector Spaces of n Dimensions
6. Complex Linear Vector Spaces
7. Summation Convention. Review of Determinants
8. Linear Transformations and Matrices
9. Linear Transformations in Euclidean 3-space
10. Orthogonal Transformation in $E_3$
11. Linear Transformations in n-Dimensional Euclidean Spaces
12. Reduction of Matrices to the Diagonal Form
13. Real Symmetric Matrices and Quadratic Forms
14. Illustrations of Reduction of Quadratic Forms
15. Classification and Properties of Quadratic Forms
16. Simultaneous Reduction of Two Quadratic Forms to a Sum of Squares
17. Unitary Transformations and Hermitean Matrices


2 TENSOR THEORY

18. Scope of Tensor Analysis. Invariance
19. Transformation of Coordinates
20. Properties of Admissible Transformations of Coordinates
21. Transformation by Invariance
22. Transformation by Covariance and Contravariance
23. The Tensor Concept. Contravariant and Covariant Tensors
24. Tensor Character of Covariant and Contravariant Laws
25. Algebra of Tensors
26. Quotient Laws
27. Symmetric and Skew-Symmetric Tensors
28. Relative Tensors
29. The Metric Tensor
30. The Fundamental and Associated Tensors
31. Christoffel's Symbols
32. Transformation of Christoffel's Symbols
33. Covariant Differentiation of Tensors
34. Formulas for Covariant Differentiation
35. Ricci's Theorem
36. Riemann-Christoffel Tensor




122. Fluid Mechanics. Equations of Continuity 

123. Ideal Fluids. Euler’s Equations 

124. Viscous Fluids. Navier’s Equations 

125. Remarks on Turbulent Flows and Dissipative Media 


Bibliography 
Index 




1
LINEAR VECTOR SPACES. MATRICES 


1. Coordinate Systems 


In order to locate a geometrical configuration a reference frame is 
needed. Among the simplest reference frames used in mathematics are 
the cartesian coordinate systems. Although the construction of such 
coordinate systems is familiar to the reader from courses in analytic 
geometry, we review it here in order to set in relief certain basic notions 
that underlie the concept of coordinates covering the space of our physical 
intuition. This review will pave the ground for some far-reaching generali- 
zations of the concept of physical space, which we formulate in Sec. 4. 

The cardinal idea responsible for the invention of coordinate systems 
by Descartes is the identification of the set of points composing a straight 
line with the totality of real numbers. It consists of the assumption that 
to each real number there corresponds a unique point on a straight line, 
and conversely.¹

We choose a straight line X and a point O on it (Fig. 1). This point 
O, which we call the origin, divides the line into two half-rays. We 


[Fig. 1. The line X with the origin O and the points Q, A, and P marked on it.]


1 Although the idea of one-to-one reciprocal correspondence between the set of 
points composing a line and the totality of real numbers had its roots in the Eudoxus
theory of incommensurables, dating back to the fourth century B.C., the invention of 
coordinate systems did not come until the first part of the seventeenth century. It should 
be also noted that a rigorous analysis of the relation between linear sets of points and 
real numbers was made only during the closing years of the last century, chiefly through 
the efforts of Dedekind and Cantor. The concept of rigor depends entirely on conven- 
tions dictated by prevailing tastes indicative of the degree of mathematical sophistication 
in a given chronological period. Fruitful intuitive concepts are usually made rigorous 
by (a) making explicit agreements as to which ideas fall into a category of definable 
concepts and which do not, and (b) introducing into mathematical theories new modes 
of reasoning which (one hopes) are free of contradiction. 



Fig. 2. 


designate one of these as the positive and the other as the negative half-ray. 
On the positive half-ray we choose a point A and call the length of the 
line segment OA the unit length. We next coordinate points on X with a 
set of real numbers in the following way: If P is any point on the positive 
half-ray, we define a number x associated with P by the formula

$$x = \frac{OP}{OA},$$

where OP and OA are lengths of the line segments OP and OA. The
number x is the coordinate of P. The coordinate x of the point Q on
the negative half-ray is defined by the ratio

$$x = -\frac{OQ}{OA}.$$

We also assume that each real number x corresponds to one and only 
one point on X. This association of the set of points on X with the set 
of real numbers constitutes a coordinate system of the one-dimensional 
space consisting of points on X. 

The coordination of the set of points lying in the plane with sets of
real numbers is accomplished by taking two straight lines $X_1$ and $X_2$
intersecting at a single point O (Fig. 2). On each line a coordinate system
is constructed as above, but the units on each line need not be equal.
A pair of such lines with unit points A and B marked on them form the
coordinate axes $X_1$, $X_2$. With each point P in the plane of coordinate
axes we associate an ordered pair of real numbers $(x_1, x_2)$ determined as
follows. The line through P drawn parallel to the $X_2$-axis intersects the
$X_1$-axis in a point $M_1$ with coordinate $x_1$, and the line through P parallel
to the $X_1$-axis cuts $X_2$ in a point $M_2$ with coordinate $x_2$. The ordered
pair of numbers $(x_1, x_2)$ are the coordinates of P in the plane, and the




one-to-one correspondence of ordered pairs of numbers with the set of
points in the plane $X_1X_2$ is the coordinate system of the two-dimensional
space consisting of points in the plane.

The extension of this representation to points in a three-dimensional
space is obvious. We take three noncoplanar lines $X_1$, $X_2$, $X_3$ intersecting
at the common point O. On each of these lines we establish coordinate
systems, and we associate with each point P an ordered triplet of numbers
$(x_1, x_2, x_3)$ determined by the intersection with the axes of three planes
drawn through P parallel to the coordinate planes $X_1X_2$, $X_2X_3$, and $X_1X_3$.

The coordinate systems just described are called oblique cartesian 
systems. Their construction makes use of the notions of length and 
parallelism of ordinary Euclidean geometry, and the essential feature of 
it is the concept of one-to-one correspondence of points with ordered 
sets of numbers. In the event the coordinate axes $X_1$, $X_2$, $X_3$ intersect
at right angles, the coordinate system is said to be orthogonal cartesian,
or rectangular cartesian. In applications, orthogonal coordinate systems
are generally used because the expression for the length d of the line
segment AB joining a pair of points with coordinates $A(a_1, a_2, a_3)$ and
$B(b_1, b_2, b_3)$ has the simple form

(1.1) $d = \sqrt{(b_1 - a_1)^2 + (b_2 - a_2)^2 + (b_3 - a_3)^2}.$


This is the familiar formula of Pythagoras. If the coordinate system is
oblique, the formula for the distance d is somewhat more complicated.
We will learn in Sec. 9 that one can pass from an orthogonal system of
coordinates to an oblique system by making a linear transformation of
coordinates. From this fact and from the structure of formula 1.1, it
would follow that the length of the line segment joining the points with
oblique coordinates $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ is

(1.2) $d = \sqrt{\sum_{i,j=1}^{3} g_{ij}(y_i - x_i)(y_j - x_j)},$

where the $g_{ij}$'s are constants that depend on the coefficients in the
above-mentioned linear transformation of coordinates. We will be concerned
in the sequel with a detailed study of quadratic forms appearing under
the radical in formula 1.2 and with their bearing on metric properties of
space.
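The relation between formulas 1.1 and 1.2 can be checked numerically. The sketch below is only an illustration under stated assumptions: the oblique frame (axes $X_1$ and $X_2$ meeting at 60 degrees, $X_3$ perpendicular to both, equal units on all axes) and the helper names `metric_coefficients`, `oblique_distance`, and `to_orthogonal` are hypothetical and do not come from the text. The constants $g_{ij}$ are taken here as the scalar products of the unit direction vectors of the oblique axes.

```python
import math

def metric_coefficients(axes):
    """Return g[i][j] = e_i . e_j, where e_i are the axis direction vectors
    written in orthogonal cartesian components (an assumed construction)."""
    n = len(axes)
    return [[sum(p * q for p, q in zip(axes[i], axes[j])) for j in range(n)]
            for i in range(n)]

def oblique_distance(g, x, y):
    """Distance by formula (1.2): sqrt(sum_ij g_ij (y_i - x_i)(y_j - x_j))."""
    diff = [yi - xi for xi, yi in zip(x, y)]
    n = len(diff)
    return math.sqrt(sum(g[i][j] * diff[i] * diff[j]
                         for i in range(n) for j in range(n)))

def to_orthogonal(axes, x):
    """Orthogonal cartesian coordinates of the point with oblique coordinates x."""
    return [sum(x[i] * axes[i][k] for i in range(len(axes))) for k in range(3)]

# Hypothetical oblique frame: X1 and X2 meet at 60 degrees, X3 is perpendicular to both.
AXES = [
    (1.0, 0.0, 0.0),
    (math.cos(math.pi / 3), math.sin(math.pi / 3), 0.0),
    (0.0, 0.0, 1.0),
]
G = metric_coefficients(AXES)

x_obl = (0.0, 0.0, 0.0)
y_obl = (1.0, 1.0, 0.0)

# Distance computed directly from the oblique coordinates.
d_oblique = oblique_distance(G, x_obl, y_obl)
# Cross-check: formula (1.1) applied to the orthogonal coordinates of the same points.
d_pythagoras = math.dist(to_orthogonal(AXES, x_obl), to_orthogonal(AXES, y_obl))
```

In this frame the two computations should agree, since both measure the same segment; only the coordinates used to describe its end points differ.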


2. The Geometric Concept of a Vector 


In the preceding section we recalled the construction of coordinate 
systems in the familiar three-dimensional space where the formula of 
Pythagoras is used to measure distances between pairs of points. Spaces 


4 LINEAR VECTOR SPACES. MATRICES [CHAP. 1 


B b c 


Fig. 3. 


where it is possible to construct a coordinate system such that the length 
of a line segment is given by the formula of Pythagoras are called Euclidean 
spaces. In these spaces the notion of displacement is fundamental. Thus, 
if a point A is moved to a new position B, the displacement from A to B 


= 
can be visualized as directed line segment AB (Fig. 3). If B is displaced 
to a new position C, the resultant displacement can be achieved by moving 
the point A to the position C. These operations can be denoted sym- 
bolically by the equation 


—> — —> 
AB E BCSZAC: 


In the elementary treatment of vector analysis, directed line segments 
are termed vectors, and they are usually denoted by a single letter printed 
in boldface type. Thus the foregoing formula can be written 


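(2.1) $\mathbf{a} + \mathbf{b} = \mathbf{c},$

where $\overrightarrow{AB} = \mathbf{a}$, $\overrightarrow{BC} = \mathbf{b}$, $\overrightarrow{AC} = \mathbf{c}$.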

The rule for the composition of vectors indicated in Fig. 3 was first 
formulated by Stevinus in 1586 in connection with the experimental 
study of laws governing the composition of forces. It is known as the 
parallelogram law of addition. The fact that many entities occurring in 
physics can be represented by directed line segments, whose law of 
composition is symbolized by formula 2.1, is responsible for the usefulness 
of vector analysis in applications. We have here an instance of geometriza- 
tion of physics which had no less important influence on the evolution 
of this subject than the arithmetization of geometry had on the growth 
of mathematical analysis. 

From the idea of a vector as displacement determined by a pair of 
points, we are led to conclude that two vectors are equal if the line seg- 
ments representing them are equal in length and their directions parallel. 
We shall denote the length of the vector a by the symbol $|\mathbf{a}|$. We will
assume that the concept of length is independent of the chosen reference
frame, so that the length $|\mathbf{a}|$ can be calculated (by the Pythagorean formula)
from the coordinates of the initial and terminal points of a.

The negative of the vector a (written —a) is the vector whose length 
is the same as that of a but whose direction is opposite. We define the 


vector zero (written 0) corresponding to a zero displacement by the 
formula 


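$$\mathbf{a} + (-\mathbf{a}) = \mathbf{0}.$$

From the geometrical properties of directed line segments we deduce that

(I) $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$.
(II) $(\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})$.
(III) If a and b are vectors, there exists a unique vector x such that
$\mathbf{a} = \mathbf{b} + \mathbf{x}$.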


We next define the operation of multiplication of vectors by real numbers.
If $\alpha$ is a real number, the symbol $\alpha\mathbf{a} = \mathbf{a}\alpha$ is a vector whose length is
$|\alpha|\,|\mathbf{a}|$ and whose direction is the same as that of a if $\alpha > 0$, and opposite to
that of a if $\alpha < 0$. If $\alpha = 0$, then $\alpha\mathbf{a} = \mathbf{0}$.

From this definition and from properties of real numbers we conclude 
that 


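(IV) $(\alpha_1 + \alpha_2)\mathbf{a} = \alpha_1\mathbf{a} + \alpha_2\mathbf{a}$,
(V) $\alpha(\mathbf{a} + \mathbf{b}) = \alpha\mathbf{a} + \alpha\mathbf{b}$,
(VI) $\alpha_1(\alpha_2\mathbf{a}) = (\alpha_1\alpha_2)\mathbf{a}$, $1 \cdot \mathbf{a} = \mathbf{a}$,

for any real numbers $\alpha_1$ and $\alpha_2$.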

We introduce next the definition of scalar product of two vectors, which 
will provide us with a new notation for the length of a vector. 

DEFINITION. The scalar product of two vectors a and b, written $\mathbf{a} \cdot \mathbf{b}$,
is the real number $|\mathbf{a}|\,|\mathbf{b}| \cos(\mathbf{a}, \mathbf{b})$, where $\cos(\mathbf{a}, \mathbf{b})$ is the cosine of the angle
between a and b.

Stated in the language of geometry, $\mathbf{a} \cdot \mathbf{b}$ is equal to the
projection of a on b multiplied by the length of b. Thus the length of
the vector a is given by the positive square root of $\mathbf{a} \cdot \mathbf{a}$. We also note
that a and b are orthogonal if, and only if, $\mathbf{a} \cdot \mathbf{b} = 0$.

From this definition and the properties of real numbers we can easily 
deduce the following theorems. 


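(VII) $\mathbf{a} \cdot \mathbf{a} = |\mathbf{a}|^2 > 0$, unless $\mathbf{a} = \mathbf{0}$.
(VIII) $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$.
(IX) $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$.
(X) $\alpha(\mathbf{a} \cdot \mathbf{b}) = (\alpha\mathbf{a} \cdot \mathbf{b})$, where $\alpha$ is a real number.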




3. Linear Vector Spaces. Dimensionality of Space 


We formulate next the definition of linear dependence of a set of vectors
$\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$, which will have an important connection with the concept
of dimensionality of space.

Linear Dependence. A set of n vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is called linearly
dependent if there exist numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all of which are zero,
such that

$$\alpha_1\mathbf{a}_1 + \alpha_2\mathbf{a}_2 + \cdots + \alpha_n\mathbf{a}_n = \mathbf{0}.$$


If no such numbers exist, the vectors are said to be linearly independent. 
Consider two vectors a and b which are like, or oppositely, directed
(Fig. 4). Then there exists a number $k \neq 0$ such that

(3.1) $\mathbf{b} = k\mathbf{a}.$

If we set $k = -\alpha/\beta$, we can write this equation as

$$\alpha\mathbf{a} + \beta\mathbf{b} = \mathbf{0},$$

and hence two collinear (or parallel) vectors are linearly dependent since
neither $\alpha$ nor $\beta$ is zero. We will say that the totality of vectors $k\mathbf{a}$ for
an arbitrary real k and $\mathbf{a} \neq \mathbf{0}$ forms a one-dimensional real linear vector
space. The reason for this terminology is clear since every point on the
line can be represented by some position vector $k\mathbf{a}$.

If a and b are two noncollinear vectors, represented by directed line 
segments with common origin O (Fig. 5), any vector c lying in the plane
of a and b can be represented in the form

(3.2) $\mathbf{c} = m\mathbf{a} + n\mathbf{b}.$

Formula 3.2 follows at once from the rule for addition of vectors and
from the definition of multiplication of vectors by scalars. Equation 3.2
can be rewritten in symmetric form to read

$$\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} = \mathbf{0},$$


which is the condition for linear dependence of the set of three vectors, 
since not all constants in this formula vanish. The formula ma + nb, 
where a and b are two linearly independent vectors and m and n are 
arbitrary real numbers, defines a two-dimensional real linear vector space. 
We see that in a two-dimensional linear vector space a set of three vectors 
is always linearly dependent. 
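As a small numerical illustration of formula 3.2, the sketch below finds the numbers m and n for given plane vectors. It assumes that the vectors are specified by two components each in some cartesian frame, which the text introduces only later; the function name `components_in_plane` and the sample vectors are hypothetical.

```python
def components_in_plane(a, b, c):
    """Solve c = m*a + n*b for (m, n) by Cramer's rule.

    a, b, c are 2-component tuples; a and b must be noncollinear.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are collinear, so they do not span the plane")
    m = (c[0] * b[1] - c[1] * b[0]) / det
    n = (a[0] * c[1] - a[1] * c[0]) / det
    return m, n

# Hypothetical sample vectors.
a = (1.0, 0.0)
b = (1.0, 1.0)
c = (3.0, 2.0)

m, n = components_in_plane(a, b, c)

# The relation m*a + n*b - c = 0 is a nontrivial linear relation among a, b, c,
# because the coefficient of c is -1 and hence not zero.
residual = tuple(m * ai + n * bi - ci for ai, bi, ci in zip(a, b, c))
```

The vanishing of `residual` exhibits the linear dependence of the three vectors in the symmetric form, with the coefficients m, n, and -1.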


[Figs. 4, 5, and 6. Vectors issuing from the common origin O.]


If we start with three noncoplanar vectors a, b, c issuing from the
common origin O (Fig. 6), we can clearly represent every vector d in
the form

(3.3) $\mathbf{d} = m\mathbf{a} + n\mathbf{b} + p\mathbf{c},$

from which it follows that among four vectors a, b, c, d there always
exists a nontrivial relation of the form

$$\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} + \delta\mathbf{d} = \mathbf{0}.$$


Formula 3.3, for an arbitrary choice of real numbers m, n, p, defines a 
three-dimensional real linear vector space. The terminal points of position 
vectors d sweep out a three-dimensional space of points if m, n, and p are 
allowed to range over the entire set of real numbers. In a three-dimensional 
linear vector space every set of four vectors is linearly dependent. We 
will make use of the connection of the number of linearly independent 
vectors with the dimensionality of space to formulate the concept of 
dimensionality of a linear vector space of n dimensions. 

The vectors a, b, and c in (3.3) are called base or coordinate vectors,
and the numbers m, n, and p are the measure numbers or components of 
the vector d. Once a set of base vectors is specified, every vector is 
determined uniquely by a triplet of measure numbers. 

A set of three mutually orthogonal vectors in a three-dimensional
space is obviously linearly independent, and if we choose as our coordinate
vectors three mutually orthogonal vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, each of length 1,
the resulting set of base vectors is said to be orthonormal.

We can visualize a set of orthonormal vectors directed along the axes
of a suitable rectangular cartesian reference frame; in this case every
vector x has the representation

$$\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3,$$

where $(x_1, x_2, x_3)$ are called the physical components of x, and the terminal
points of the base vectors $\mathbf{a}_i$, (i = 1, 2, 3), have the coordinates

$\mathbf{a}_1$: (1, 0, 0),
$\mathbf{a}_2$: (0, 1, 0),
$\mathbf{a}_3$: (0, 0, 1).


We conclude this section by noting the rules for the addition and
multiplication of vectors when the latter are referred to an orthonormal
system of base vectors $\mathbf{a}_i$, (i = 1, 2, 3). If we have two vectors x and y
whose components are $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$, respectively, then the
vector x + y has the components $(x_1 + y_1, x_2 + y_2, x_3 + y_3)$. If $\alpha$ is a
real number, the components of the vector $\alpha\mathbf{x}$ are $(\alpha x_1, \alpha x_2, \alpha x_3)$. From
the distributive law of scalar multiplication of vectors it follows at once
that the product of

$$\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3$$

and

$$\mathbf{y} = y_1\mathbf{a}_1 + y_2\mathbf{a}_2 + y_3\mathbf{a}_3$$

is

$$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + x_2y_2 + x_3y_3,$$

since $\mathbf{a}_i \cdot \mathbf{a}_j = \delta_{ij}$, where $\delta_{ij} = 1$ if i = j, and $\delta_{ij} = 0$ when $i \neq j$. This
follows from the assumed orthonormal nature of the base vectors $\mathbf{a}_i$.
The foregoing formula leads at once to the familiar expression for the
length $|\mathbf{x}|$ of the vector x referred to an orthogonal cartesian reference
system. Thus

$$\mathbf{x} \cdot \mathbf{x} = x_1^2 + x_2^2 + x_3^2 = |\mathbf{x}|^2,$$

so that

$$|\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + x_3^2}.$$

Clearly $|\mathbf{x}| > 0$, unless $\mathbf{x} = \mathbf{0}$.
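The component rules just stated lend themselves to a direct numerical check. The following is a minimal sketch, assuming vectors are stored as tuples of their components relative to an orthonormal set of base vectors; the function names `add`, `scale`, `dot`, and `length` and the sample vectors are our own and purely illustrative. It also spot-checks laws VIII, IX, and X of Sec. 2 for one choice of vectors, which of course proves nothing in general.

```python
import math

def add(x, y):
    """Components of x + y."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(alpha, x):
    """Components of alpha * x."""
    return tuple(alpha * xi for xi in x)

def dot(x, y):
    """Scalar product x . y = x1*y1 + x2*y2 + x3*y3 in an orthonormal basis."""
    return sum(xi * yi for xi, yi in zip(x, y))

def length(x):
    """Length |x| as the positive square root of x . x."""
    return math.sqrt(dot(x, x))

# Hypothetical sample vectors and scalar.
x = (1.0, 2.0, 2.0)
y = (2.0, 0.0, 1.0)
z = (0.0, 3.0, -1.0)
alpha = 3.0

xy = dot(x, y)        # 1*2 + 2*0 + 2*1
norm_x = length(x)    # sqrt(1 + 4 + 4)

law_viii = dot(x, y) == dot(y, x)                         # commutativity
law_ix = dot(x, add(y, z)) == dot(x, y) + dot(x, z)       # distributivity
law_x = alpha * dot(x, y) == dot(scale(alpha, x), y)      # scalar factor
```

The sample values were chosen so that all the arithmetic is exact in floating point; with arbitrary data the equalities should be tested to within a tolerance instead.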


4. N-Dimensional Spaces 


In a variety of circumstances one encounters a correspondence of sets 
of objects with ordered sets of numbers where the number of independent 
entities exceeds three. For instance, in dealing with the states of gas 
determined by the pressure (p), the volume (v), the temperature (T), and 
the time (t), one may wish to coordinate these entities with ordered sets
of four real numbers $(x_1, x_2, x_3, x_4)$. Here a diagrammatic representation
of the states of gas by points in the three-dimensional physical space is


clearly impossible. However, the essential idea in the concept of coordi- 
nate systems is not a pictorial representation but the one-to-one reciprocal 
correspondence of objects with sets of numbers. The notion of distance 
between pairs of arbitrary points is, likewise, irrelevant. Indeed, the 
idea of distance becomes devoid of geometric sense even in the familiar 
representation of the states of gas [the pressure (p) and the volume (v)] 
by points in the cartesian pv-plane. It is manifestly absurd to speak
of the distance between two states characterized by ordered pairs of 
numbers (p, v). 

The utility of analytic treatment of physical problems is so great that 
we are naturally led to form the concept of spaces of higher dimensions 
by utilizing the idea of one-to-one correspondence between the sets of 
numbers and objects. The “objects” here might be of quite diverse sorts. 
In certain situations they might be pressures, volumes, and temperatures; 
in others they might be the amounts of electrical charge and the complex 
potentials produced by the motion of such charge, and so on. 

We define² a space (or manifold) of N dimensions as any set of objects
that can be placed in a one-to-one correspondence with the totality of
ordered sets of N (real or complex) numbers $x_1, x_2, \ldots, x_N$ such that

$$|x_i - A_i| < k_i, \qquad (i = 1, 2, \ldots, N),$$

where $A_1, \ldots, A_N$ are constants and the $k_1, k_2, \ldots, k_N$ are real numbers.


The inequalities in this definition specify the range of variation of the
numbers $x_i$. If the numbers $x_i$ are real, the N-dimensional space is real,
and we can write the inequalities in the form

$$a_1 \le x_1 \le b_1, \quad a_2 \le x_2 \le b_2, \quad \ldots, \quad a_N \le x_N \le b_N.$$

Some of the equality signs may be omitted, and we may have for the
range of variables $x_i$, for example, $0 < x_i < \infty$.

We denote the space of N dimensions by the symbol $V_N$, and we use
the term “points” to mean “objects.”

Any particular one-to-one association of the points with the ordered sets
of numbers $(x_1, x_2, \ldots, x_N)$ is called a coordinate system, and the numbers
$x_1, x_2, \ldots, x_N$ are termed the coordinates of points in the coordinate
system.

There is no implication in these definitions that the concept of distance
between pairs of points has any meaning. If a rule is specified for the
measurement of the distance between points, the space $V_N$ is called metric.
For the time being we will not assume that our spaces are metrized.


2 Compare O. Veblen, Invariants of Quadratic Differential Forms, p. 13. In speaking 
of one-to-one correspondence we always have in mind one-to-one reciprocal corre- 


spondence. 




A set of equations of the form

(4.1) $x_i = x_i(y_1, y_2, \ldots, y_N), \qquad (i = 1, 2, \ldots, N),$

in which the functions $x_i$ are single-valued and are such that, in the region
under consideration, they yield N single-valued solutions

$$y_i = y_i(x_1, x_2, \ldots, x_N),$$

defines a transformation of coordinates.

We will defer a discussion of the general functional transformation
(4.1) to Chapter 2. In the remainder of this chapter we are concerned
with a detailed study of an important case of linear (or affine)
transformations of coordinates

$$y_i = \sum_{j=1}^{N} a_{ij} x_j, \qquad (i = 1, \ldots, N),$$

and with the bearing of such transformations on linear vector spaces.
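A linear transformation of coordinates and its single-valued inverse can be sketched in a few lines. The example below is an illustration only: the coefficients in `A`, the sample point, and the helper names `transform` and `inverse_2x2` are hypothetical, and the inverse is written out for N = 2 alone. It relies on the standard fact that a linear transformation possesses a single-valued inverse when the determinant of its coefficients does not vanish.

```python
def transform(a, x):
    """Return y with y_i = sum_j a[i][j] * x[j]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def inverse_2x2(a):
    """Coefficients of the inverse transformation for N = 2."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("determinant is zero: no single-valued inverse exists")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

# Hypothetical coefficients a_ij and coordinates x_j.
A = [[2.0, 1.0],
     [1.0, 1.0]]
x = [3.0, -1.0]

y = transform(A, x)                      # coordinates in the new system
x_back = transform(inverse_2x2(A), y)    # recover the original coordinates
```

Recovering `x` from `y` illustrates the one-to-one reciprocal character required of a transformation of coordinates.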


5. Linear Vector Spaces of n Dimensions 


A sketch of the rudiments of vector analysis in Sec. 2, based on the 
concept of directed displacement, contained a set of ten theorems em- 
bodied in the formulas identified by the Roman numerals. These theorems 
can be taken as a point of departure in the generalization of the concept 
of a vector in the n-dimensional space since the idea of directed displace- 
ment and length become devoid of familiar sense whenever n exceeds 
three. Accordingly, we will postulate that there exist points in the 
n-dimensional space and that 


A. Every two points in the real n-dimensional space determine an entity 
which we call a vector. We denote this entity by the symbol a. 

B. Every two vectors a and b have a sum a + b which obeys laws I, II,
and III stated in Sec. 2.

It follows from the third of these laws that the operation of subtraction 
of vectors is unique and that there exists a vector 0 such that a + 0 = a 
for every vector a. 

C. For every real number $\alpha$ and vector a there exists a vector $\alpha\mathbf{a} = \mathbf{a}\alpha$
obeying laws IV, V, and VI of Sec. 2.

We retain the definition of linear dependence of the set of n vectors,
with respect to the field of real numbers $\alpha_1, \ldots, \alpha_n$, and take as our
axiom of dimensionality the assumption that

D. There exist n linearly independent vectors in the n-dimensional space, 
but every set of n + 1 vectors is linearly dependent. 




This axiom implies that every vector x can be represented in the form

(5.1) $\mathbf{x} = \alpha_1\mathbf{a}_1 + \alpha_2\mathbf{a}_2 + \cdots + \alpha_n\mathbf{a}_n,$

where $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is any set of n linearly independent vectors. We
say that the totality of vectors determined by formula 5.1, where the $\alpha_i$
are arbitrary real numbers, constitutes a real linear vector space of n
dimensions.

To give a meaning to the concepts of length and orthogonality of 
vectors we need a postulate 

E. With every two vectors a and b we can associate a number $\mathbf{a} \cdot \mathbf{b}$,
called their scalar product, which obeys laws VII, VIII, IX, and X of
Sec. 2.

At this stage we are not concerned with the nature of the formula used
to calculate the number $\mathbf{a} \cdot \mathbf{b}$. Suffice it to say that the properties embodied
in the laws of scalar multiplication lead to a definite rule for computing
$\mathbf{a} \cdot \mathbf{b}$ once a coordinate system is introduced for the specification of
coordinates of points determining the vectors.

A vector satisfying postulates A through E is said to be defined in the
n-dimensional Euclidean space $E_n$.

We shall use the language of Euclidean geometry and will mean by the
length of the vector a the positive square root of the scalar product of
the vector a by itself. Thus the length $|\mathbf{a}| = \sqrt{\mathbf{a} \cdot \mathbf{a}}$. If $|\mathbf{a}| = 1$ the
vector a is called a unit vector. Two vectors a and b are said to be
orthogonal whenever $\mathbf{a} \cdot \mathbf{b} = 0$.

We proceed to demonstrate that every set of m linearly independent
vectors in $E_n$ $(m \le n)$ can be orthogonalized. This means that from
a given set of m linearly independent vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$ we can
construct a set of vectors $\mathbf{a}_1, \ldots, \mathbf{a}_m$ such that $\mathbf{a}_i \cdot \mathbf{a}_j = 0$ whenever $i \neq j$.
Moreover, it is possible to choose the vectors $\mathbf{a}_i$ so that they are unit
vectors.

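Proof. We assume that the set of vectors $\{\mathbf{x}_i\}$, $(i = 1, \ldots, m)$, is
linearly independent. Thus the equation

(5.2)  $c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_m \mathbf{x}_m = 0$

can be satisfied only by choosing $c_1 = c_2 = \cdots = c_m = 0$. It follows
that $\mathbf{x}_1 \ne 0$, for, if it were zero, the numbers

$c_1 = 1, \quad c_2 = c_3 = \cdots = c_m = 0$

would satisfy (5.2) and hence the vectors would be linearly dependent,
which is contrary to our hypothesis. Denote by $\mathbf{a}_1$ the product of $\mathbf{x}_1$ by
the reciprocal of its length so that

$\mathbf{a}_1 = \mathbf{x}_1 / |\mathbf{x}_1|.$

Clearly $\mathbf{a}_1 \cdot \mathbf{a}_1 = 1$, so that $\mathbf{a}_1$ is a unit vector.
The set of vectors

$\mathbf{a}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots, \mathbf{x}_m$

is obviously a linearly independent set. Consider next the vector

$\mathbf{a}_2' = \mathbf{x}_2 - (\mathbf{x}_2 \cdot \mathbf{a}_1)\mathbf{a}_1.$

The product of this vector by $\mathbf{a}_1$ vanishes since

$\mathbf{x}_2 \cdot \mathbf{a}_1 - (\mathbf{x}_2 \cdot \mathbf{a}_1)\,\mathbf{a}_1 \cdot \mathbf{a}_1 = 0.$

Thus $\mathbf{a}_2'$ is orthogonal to $\mathbf{a}_1$ and $\mathbf{a}_2'/|\mathbf{a}_2'| = \mathbf{a}_2$ is a unit vector.
The set of vectors

$\mathbf{a}_1, \mathbf{a}_2, \mathbf{x}_3, \ldots, \mathbf{x}_m$

is linearly independent, and we can define the vector $\mathbf{a}_3'$ by the formula

$\mathbf{a}_3' = \mathbf{x}_3 - (\mathbf{x}_3 \cdot \mathbf{a}_1)\mathbf{a}_1 - (\mathbf{x}_3 \cdot \mathbf{a}_2)\mathbf{a}_2,$

which is orthogonal to both $\mathbf{a}_1$ and $\mathbf{a}_2$. The vector $\mathbf{a}_3 = \mathbf{a}_3'/|\mathbf{a}_3'|$ is a unit
vector, and the set

$\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{x}_4, \ldots, \mathbf{x}_m$

is a linearly independent set of vectors.
A repetition of this procedure will yield a set of m linearly independent
unit vectors

(5.3)  $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m,$

each of which is expressed in terms of the $\mathbf{x}_i$. The set of orthogonal
unit vectors (5.3) is called an orthonormal set.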
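The orthogonalization procedure of the proof can be sketched numerically. The following is a minimal illustration in Python, assuming real vectors stored as lists of floats; the names `dot`, `gram_schmidt`, and the tolerance `tol` are choices made for this sketch and do not come from the text.

```python
import math


def dot(u, v):
    """Scalar product of two real n-tuples: u_1 v_1 + ... + u_n v_n."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gram_schmidt(xs, tol=1e-12):
    """Build orthonormal vectors a_1..a_m from linearly independent x_1..x_m.

    `tol` is an illustrative threshold used to detect linear dependence
    in floating-point arithmetic.
    """
    basis = []
    for x in xs:
        # a_k' = x_k - sum over j of (x_k . a_j) a_j
        residual = list(x)
        for a in basis:
            coeff = dot(x, a)
            residual = [r - coeff * ai for r, ai in zip(residual, a)]
        length = math.sqrt(dot(residual, residual))
        if length < tol:
            raise ValueError("the given vectors are linearly dependent")
        # a_k = a_k' / |a_k'| is a unit vector
        basis.append([r / length for r in residual])
    return basis


# Hypothetical example: three linearly independent vectors in E_3.
xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
a = gram_schmidt(xs)
```

Each resulting vector is a linear combination of the given vectors, as in the proof; rounding errors mean the orthogonality relations hold only approximately.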

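If m = n, the set of orthonormal vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is called complete
because every vector $\mathbf{x}$ in $E_n$ can be represented in the form

(5.4)  $\mathbf{x} = \alpha_1 \mathbf{a}_1 + \alpha_2 \mathbf{a}_2 + \cdots + \alpha_n \mathbf{a}_n.$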


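By analogy with the three-dimensional case, a complete set of orthonormal
vectors can be taken as a set of coordinate vectors oriented along
the axes of the n-dimensional orthogonal cartesian reference frame. The
terminal points of these vectors then have the coordinates

$(1, 0, 0, \ldots, 0),$
$(0, 1, 0, \ldots, 0),$
$\cdots$
$(0, 0, 0, \ldots, 1).$

The constants $\alpha_1, \alpha_2, \ldots, \alpha_n$ in (5.4) are called the components of the
vector $\mathbf{x}$. Multiplying (5.4) scalarly by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ in turn, and
remembering that³ $\mathbf{a}_i \cdot \mathbf{a}_j = \delta_{ij}$, we obtain

$\mathbf{a}_1 \cdot \mathbf{x} = \alpha_1, \quad \mathbf{a}_2 \cdot \mathbf{x} = \alpha_2, \quad \ldots, \quad \mathbf{a}_n \cdot \mathbf{x} = \alpha_n.$

Thus the vector $\mathbf{x}$ can be represented in the form

(5.5)  $\mathbf{x} = (\mathbf{a}_1 \cdot \mathbf{x})\mathbf{a}_1 + (\mathbf{a}_2 \cdot \mathbf{x})\mathbf{a}_2 + \cdots + (\mathbf{a}_n \cdot \mathbf{x})\mathbf{a}_n.$

If we introduce the notation $\mathbf{a}_i \cdot \mathbf{x} = x_i$, equation 5.5 assumes the form

$\mathbf{x} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n.$

Using the distributive property of scalar multiplication, we get

(5.6)  $\mathbf{x} \cdot \mathbf{x} = x_1^2 + x_2^2 + \cdots + x_n^2,$

so that

$|\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$

This is the formula of Pythagoras in $E_n$.
If $\mathbf{y} = y_1 \mathbf{a}_1 + y_2 \mathbf{a}_2 + \cdots + y_n \mathbf{a}_n$, then

$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$

This formula has the same structure as the expression for the scalar
product of two vectors in ordinary three-dimensional space of Euclidean
geometry.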
We note that in an orthogonal cartesian reference frame a vector $\mathbf{x}$ is
uniquely determined by an n-tuple of numbers $(x_1, x_2, \ldots, x_n)$. This
property is taken by some authors as the definition of a vector in $E_n$.
For the sum of two vectors $\mathbf{x}$ and $\mathbf{y}$, with components

$\mathbf{x}: (x_1, x_2, \ldots, x_n),$
$\mathbf{y}: (y_1, y_2, \ldots, y_n),$

we have the formula

$\mathbf{x} + \mathbf{y}: (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$

and for the product of $\mathbf{x}$ by the scalar $\alpha$,

$\alpha\mathbf{x}: (\alpha x_1, \alpha x_2, \ldots, \alpha x_n).$

The formula

$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$

serves to define metric properties of vectors in $E_n$.

³ The symbol $\delta_{ij}$, the Kronecker delta, means
$\delta_{ij} = 1$, if $i = j$,
$\delta_{ij} = 0$, if $i \ne j$.

The passage from an orthonormal set of vectors $\mathbf{a}_i$ to any other set of
base vectors is accomplished by subjecting the elements of the n-tuple to 
a suitable linear transformation. In essence the approach to vectors by 
way of n-tuples of numbers reduces the study of vectors to the study of 
algebraic properties of linear transformations. In this book we prefer 
to stress the geometric concepts that underlie the idea of a vector and not 
submerge them in a purely algebraic formalism. 


6. Complex Linear Vector Spaces 


The considerations of Sec. 5 can be easily extended to the field of 
complex numbers. 

In a complex n-dimensional linear vector space the vector $\mathbf{x}$ is determined
by the ordered set of n complex numbers $(x_1, x_2, \ldots, x_n)$, the elements
$x_i$ of which are the components of $\mathbf{x}$.

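We define the sum $\mathbf{x} + \mathbf{y}$ of two vectors

$\mathbf{x}: (x_1, x_2, \ldots, x_n),$
$\mathbf{y}: (y_1, y_2, \ldots, y_n)$

by the rule

$\mathbf{x} + \mathbf{y}: (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$

and the product $\alpha\mathbf{x}$ by

$\alpha\mathbf{x}: (\alpha x_1, \alpha x_2, \ldots, \alpha x_n).$

It is customary to define the scalar product $\mathbf{x} \cdot \mathbf{y}$ by the formula

(6.1)  $\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} \bar{x}_i y_i,$

where a bar over $x_i$ denotes the conjugate of the complex number $x_i$.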


We note that

(6.2)  $\mathbf{y} \cdot \mathbf{x} = \sum_{i=1}^{n} \bar{y}_i x_i,$

so that

(6.3)  $\mathbf{x} \cdot \mathbf{y} = \overline{\mathbf{y} \cdot \mathbf{x}},$

since the conjugate of the sum is the sum of the conjugates and the
conjugate of the product is equal to the product of conjugates.

Formula 6.1 is adopted for the calculation of the scalar product in
order to ensure that

$\mathbf{x} \cdot \mathbf{x} = \sum_{i=1}^{n} \bar{x}_i x_i$

be a real number. It clearly specializes to (5.6) when the numbers $x_i$
are real.

The vectors $\mathbf{x}$, $\mathbf{y}$ are said to be orthogonal if $\mathbf{x} \cdot \mathbf{y} = 0$. As regards the
notion of linear independence, we retain the definition given in Sec. 3 with
the understanding that the coefficients $\alpha_i$ now belong to the field of
complex numbers.
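A small numerical check may help fix the conventions of formula 6.1. The sketch below uses Python's built-in complex numbers; the function name `cdot` and the sample vectors are illustrative assumptions, not part of the text.

```python
def cdot(x, y):
    """Scalar product (6.1): the sum of conj(x_i) * y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


# Hypothetical sample vectors in a complex 2-dimensional space.
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

xy = cdot(x, y)   # x . y
yx = cdot(y, x)   # y . x, the conjugate of x . y as in (6.3)
xx = cdot(x, x)   # x . x, whose imaginary part vanishes
```

With the conjugate placed on the first factor, interchanging the factors conjugates the product, and the product of a vector with itself is the sum of the squared moduli of its components.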

Problems 


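1. If we start with the definition of a vector $\mathbf{x}$ as an n-tuple of n real or complex
numbers $(x_1, x_2, \ldots, x_n)$, and use for the definition of sum and product the
formulas

$\mathbf{x} + \mathbf{y}: (x_1 + y_1, \ldots, x_n + y_n),$
$k\mathbf{x}: (k x_1, \ldots, k x_n),$
$\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} \bar{x}_i y_i,$

then

$(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z},$
$\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z},$
$(k\mathbf{x}) \cdot \mathbf{y} = \bar{k}(\mathbf{x} \cdot \mathbf{y}),$
$\mathbf{x} \cdot (k\mathbf{y}) = k(\mathbf{x} \cdot \mathbf{y}).$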


2. Prove that, if $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \ldots, \mathbf{a}^{(n)}$ is a set of n linearly independent vectors in
a complex n-dimensional vector space, then the only vector $\mathbf{x}$ orthogonal to each
of the vectors $\mathbf{a}^{(i)}$ is the zero vector.

3. Prove that a set of mutually orthogonal nonzero vectors is always linearly
independent.

4. Let the set of vectors $\mathbf{a}^{(i)}$ in $E_n$: $(a_1^{(i)}, a_2^{(i)}, \ldots, a_n^{(i)})$, $i = 1, 2, \ldots, m$, be
linearly dependent, and suppose that r of them, $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \ldots, \mathbf{a}^{(r)}$, r < n, are
linearly independent. Show that every vector $\mathbf{x}$ that is orthogonal to this set of r
linearly independent vectors forming the subset of $E_n$ is also orthogonal to the
remaining n − r vectors in the given set.




7. Summation Convention. Review of Determinants 


It is clear from the developments in preceding sections that the linear 
forms and matrices associated with them enter prominently in the study 
of vectors in the n-dimensional manifolds. Since such forms will occur 
frequently throughout the remainder of this chapter, it is desirable to 
introduce a compact abridged notation and to rewrite with its aid certain 
familiar results from the theory of determinants. 

From now on we shall adhere to the following summation convention.
If in some expression a certain index occurs twice, we shall mean that this
expression is summed with respect to that index for all admissible values
of the index. Thus the linear form $\sum_{i=1}^{4} a_i x_i$ has the index i occurring in it
twice; we will omit the summation symbol $\sum$ and write $a_i x_i$ to mean
$a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4$. Of course, the range of admissible values of
the index, 1 to 4 in this case, must be specified. If the symbol i has the
range of values 1 to 3 and j ranges from 1 to 4, the expression

(7.1)  $a_{ij} x_j, \quad (i = 1, 2, 3), \quad (j = 1, 2, 3, 4),$

represents three linear forms

(7.2)  $a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + a_{14} x_4,$
       $a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + a_{24} x_4,$
       $a_{31} x_1 + a_{32} x_2 + a_{33} x_3 + a_{34} x_4.$

In expression 7.1 the index i is the identifying index. It denotes one of the
forms in (7.2), depending on the chosen value of i. The index j, however,
since it occurs twice, is the summation index. The summation (or dummy)
index can be changed at will. Thus (7.1) can be written in the form $a_{ik} x_k$,
if k has the same range of values as j. The summation index is analogous
to a variable of integration in a definite integral, which also can be
changed at will.
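The roles of the identifying and summation indices can be made concrete with a few lines of Python. In this sketch the function name `linear_forms` and the numerical coefficients are hypothetical; indices run from 0 rather than 1, as is usual in Python.

```python
def linear_forms(a, x):
    """Evaluate a_ij x_j: i is the identifying (free) index, j is summed."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def linear_forms_renamed(a, x):
    """The same forms written as a_ik x_k: renaming the dummy index changes nothing."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


# Hypothetical coefficients: i takes 3 values, j takes 4 values.
a = [[1, 2, 3, 4],
     [0, 1, 0, 1],
     [2, 0, 2, 0]]
x = [1, 1, 2, 2]

forms = linear_forms(a, x)            # one number for each value of i
same_forms = linear_forms_renamed(a, x)
```

The free index survives as the position in the returned list, while the summed index appears only inside the `sum` call, much like a variable of integration.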

Unless a statement to the contrary is made, we will assume that the
summation and the identifying indices have the ranges of values from
1 to n. Thus $a_i x_i$ will represent a linear form

$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n.$

Although in the last term of this expression the letter n occurs twice,
it does not represent the sum, since n here has a fixed value. In order to
avoid ambiguity, or when we want to suspend the summation convention,
we may enclose the index in parentheses. Thus we can write the linear
form as

$a_1 x_1 + a_2 x_2 + \cdots + a_{(n)} x_{(n)}.$

The quadratic form $\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$ will be written $a_{ij} x_i x_j$. An expression
$a_{ij} x_i y_j$ represents a bilinear form containing $n^2$ terms, whereas $a_{ij} a_{jk}$
represents $n^2$ sums of the type

$a_{i1} a_{1k} + a_{i2} a_{2k} + \cdots + a_{in} a_{nk},$

since each of the identifying, or free, indices i and k can have values
from 1 to n. We will not trouble to enclose the indices in parentheses
when the context makes it clear (as in the above expression) that such
indices have fixed values. If, however, we wish to discuss a particular
term in this sum we will write $a_{i(j)} a_{(j)k}$.

Frequently, it is convenient to identify the different symbols by using
superscripts rather than subscripts. For instance, we may write the
sequence of terms $x^1, x^2, \ldots, x^n$, where the superscripts are not the powers
of the variable but the identifying indices. The typical term in this
sequence is $x^i$, $(i = 1, 2, \ldots, n)$. A linear form in the $x^i$, with the
coefficients $a_i$, will be written as $a_i x^i$. A bilinear form, with the coefficients
$a^{ij}$, in the variables $x_i$ and $y_j$ will be written as $a^{ij} x_i y_j$.


A determinant

$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$

whose elements are $a_{ij}$ will be written, as is customary, $|a_{ij}|$. If the
elements of this determinant are denoted by $a_j^i$, where the superscript i
indicates the row and the subscript j the column in which this element
appears, we will write the determinant as $|a_j^i|$. Thus

$|a_j^i| = \begin{vmatrix} a_1^1 & a_2^1 & \cdots & a_n^1 \\ a_1^2 & a_2^2 & \cdots & a_n^2 \\ \vdots & \vdots & & \vdots \\ a_1^n & a_2^n & \cdots & a_n^n \end{vmatrix}.$

For the multiplication of two determinants $|a_j^i|$ and $|b_j^i|$ we have the
familiar rule:

$|a_j^i| \cdot |b_j^i| = |c_j^i|,$

where $c_j^i = a_k^i b_j^k$. If we deal with determinants $|a_{ij}|$ and $|b_{ij}|$, then the
element $c_{ij}$ in the ith row and the jth column of the product of $|a_{ij}|$ and
$|b_{ij}|$ is $c_{ij} = a_{ik} b_{kj}$.




The cofactor of the element $a_j^i$ in $|a_j^i|$ is denoted by $A_i^j$. If we write
the Kronecker delta as $\delta_j^i$, where

$\delta_j^i = 1$, if $i = j$,
$\delta_j^i = 0$, if $i \ne j$,

then for the expansion of $|a_j^i|$ in terms of cofactors we have the following
formulas:

(7.3)  $a_j^i A_k^j = a\,\delta_k^i,$
(7.4)  $a_j^i A_i^k = a\,\delta_j^k,$

where $a = |a_j^i|$. These formulas include the familiar simple Laplace
developments of $|a_j^i|$. The first of these then represents the expansion
in terms of the elements of the ith row; the second, in terms of the elements
of the jth column of $|a_j^i|$.

If the elements of the determinant a are denoted by $a_{ij}$, we shall write
the cofactor of $a_{ij}$ as $A_{ij}$. Simple Laplace developments corresponding
to (7.3) and (7.4) assume the forms

$a_{(i)j} A_{(i)j} = a$ and $a_{i(j)} A_{i(j)} = a.$


We can derive Cramer's rule for the solution of the system of n linear
equations

(7.5)  $a_j^i x^j = b^i, \quad (i, j = 1, \ldots, n),$

in n unknowns $x^i$, where $|a_j^i| \ne 0$, as follows: Multiply both sides of
equations in (7.5) by $A_i^k$, and sum with respect to i. This yields

$a_j^i A_i^k x^j = b^i A_i^k.$

By (7.4) this reduces to

$a\,\delta_j^k x^j = b^i A_i^k$

or

$a x^k = b^i A_i^k.$

Thus

(7.6)  $x^k = \dfrac{b^i A_i^k}{a}.$
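Cramer's rule in the cofactor form $x^k = b^i A_i^k / a$ can be tried on a small system. The following Python sketch uses exact rational arithmetic from the standard library; the helper names `minor`, `det`, `cofactor`, and `cramer`, and the sample system, are assumptions made for illustration. The recursive Laplace development is adequate only for small n.

```python
from fractions import Fraction


def minor(m, i, j):
    """The matrix m with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by Laplace development along the first row."""
    if not m:
        return 1  # determinant of the empty matrix, used to end the recursion
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def cofactor(m, i, j):
    """Cofactor of the element in row i and column j."""
    return (-1) ** (i + j) * det(minor(m, i, j))


def cramer(a, b):
    """Solve the system a x = b by x^k = b^i A_i^k / a, assuming det(a) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("the determinant of the system vanishes")
    n = len(a)
    return [Fraction(sum(b[i] * cofactor(a, i, k) for i in range(n)), d)
            for k in range(n)]


# Hypothetical system of three equations in three unknowns.
a = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = cramer(a, b)
```

Substituting the computed values back into the left-hand sides reproduces the right-hand sides, which is the verification asked for in Problem 2 of this section.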


Frequently, the cofactor of the element $a_{ij}$ in $|a_{ij}|$ is denoted by $A^{ij}$,
so that the Laplace developments (7.3) and (7.4) assume the forms

$a_{ij} A^{kj} = a\,\delta_i^k,$
$a_{ij} A^{ik} = a\,\delta_j^k.$




To gain familiarity with this notation, the reader is advised to derive
Cramer's rule when the system of linear equations is written in the form
$a_{ij} x^j = b_i$. He will also prove that, if $a_j^i b_k^j = \delta_k^i$, then $|a_j^i| = 1/|b_j^i|$.

We will return to the subject of determinants in Sec. 41, where a 
different notation permits us to eliminate references to rows and columns 
of the determinant and enables us to write it in terms of its elements, 
without reference to cofactors. 

Problems 


1. Write out in full the following expressions: 


(a) 6;'a’. (b) 6,;'x), (c) abie ta (dd) aa" 
of; l , dat l 
(e) eee (Ea (g) a = a bi. (h) angy. 
: ay* ayk . | | 
(O gi = 3 Gal’ (j) auga. (k) 6,;5%*. 


The symbols $\delta_j^i$, $\delta_{ij}$, and $\delta^{ij}$ all denote the Kronecker deltas.
2. Verify that (7.6) is the solution of (7.5). 


8. Linear Transformations and Matrices 
A set of relations of the form

(8.1)  $x_i' = a_{ij} x_j, \quad (i, j = 1, 2, \ldots, n),$

where the $a_{ij}$'s are constants, is called a linear homogeneous transformation
of the set of variables $x_i$ into a set $x_i'$. We shall suppose that the
transformation 8.1 is nonsingular, so that the set of linear equations 8.1
can be solved for the $x_i$ in terms of the $x_i'$. This implies that the
determinant $|a_{ij}|$ of the coefficients of the $x_i$'s is different from zero.

The solution of (8.1) for the x's yields

(8.2)  $x_i = \dfrac{A_{ji}}{a}\, x_j',$

where $A_{ij}$ is the cofactor of the element $a_{ij}$ in $|a_{ij}| = a$.
The set of equations 8.1 can be interpreted in two essentially different
ways:

(a) The quantities $x_i$ may be regarded as components of a vector $\mathbf{x}$:
$(x_1, x_2, \ldots, x_n)$, and the numbers $x_i'$ as components of another vector
$\mathbf{x}'$: $(x_1', x_2', \ldots, x_n')$, where both $\mathbf{x}$ and $\mathbf{x}'$ are referred to a reference
frame with the system of base vectors $\mathbf{a}_i$; in this case we think of equations
8.1 as representing a transformation of the vector $\mathbf{x}$ into another vector $\mathbf{x}'$.

(b) The two sets of numbers $(x_1, x_2, \ldots, x_n)$ and $(x_1', x_2', \ldots, x_n')$ can
be regarded as components of the same vector $\mathbf{x}$ when $\mathbf{x}$ is referred to two
different cartesian reference frames determined by the base vectors
$\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ and $\mathbf{a}_1', \mathbf{a}_2', \ldots, \mathbf{a}_n'$; in this event equations 8.1 give a
transformation of coordinate axes.


Before proceeding to a specific discussion of these two interpretations 
of the set of equations 8.1, it is necessary to review the operations with 
matrices. 

An array of mn numbers, arranged in m rows and n columns, is called
an m × n matrix. We denote the matrix formed from the elements $a_{ij}$
(or $a_j^i$) by

$(a_{ij}) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$ or $(a_j^i) = \begin{pmatrix} a_1^1 & a_2^1 & \cdots & a_n^1 \\ a_1^2 & a_2^2 & \cdots & a_n^2 \\ \vdots & \vdots & & \vdots \\ a_1^m & a_2^m & \cdots & a_n^m \end{pmatrix}.$

We shall also write the symbol A for the matrix $(a_{ij})$. We shall say that the
matrix $A = (a_{ij})$ is equal to the matrix $B = (b_{ij})$ if, and only if, $a_{ij} = b_{ij}$
for each i and j. That is, if A = B the elements in the corresponding
rows and columns of the matrices must be equal.

By the sum A + B of two matrices $A = (a_{ij})$ and $B = (b_{ij})$ of the same
type, that is, containing the same number of rows and columns, we mean
the matrix

$A + B = (a_{ij} + b_{ij}).$


If we have an m × n matrix A and an n × p matrix B, we can define
the product of matrices A and B, written AB, by the formula

(8.3)  $AB = (a_{ij} b_{jk}).$


Thus the product AB is an m X p matrix; we can multiply two matrices 
only if the number of columns in the first factor is equal to the number 
of rows in the second. 

For the most part we shall deal with square matrices, that is, matrices 
containing an equal number of rows and columns. 

A matrix all of whose elements are zero is called a zero matrix. It is 
denoted by the symbol O. 

We note two peculiarities of matrix multiplication. From the definition 
8.3 it follows that, if A and B are two n x n matrices, then AB is not 
necessarily equal to BA. 


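For example, if

$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},$

then

$AB = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$ whereas $BA = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$

Thus the product of matrices, in general, is not commutative. However,
if we have two matrices of order n, which contain zero elements everywhere
except possibly along the diagonal, then they are commutative, and obey
the simple law of multiplication

$\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \begin{pmatrix} \mu_1 & 0 & \cdots & 0 \\ 0 & \mu_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \mu_n \end{pmatrix} = \begin{pmatrix} \lambda_1\mu_1 & 0 & \cdots & 0 \\ 0 & \lambda_2\mu_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n\mu_n \end{pmatrix}.$

Such matrices are called diagonal matrices. The diagonal matrices will
be found to be of considerable importance in what follows.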
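The two observations just made, that matrix multiplication is not commutative in general but is commutative for diagonal matrices, can be checked with a direct implementation of definition 8.3. In this sketch the function name `matmul` is an illustrative choice; A and B are taken to be 2 × 2 matrices whose products differ (their entries are an assumption of this example), and `D1`, `D2` are hypothetical diagonal matrices.

```python
def matmul(a, b):
    """Product (8.3): the element in row i, column k is a_ij b_jk summed on j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]


# Two 2 x 2 matrices whose products in the two orders differ.
A = [[0, 1], [1, 0]]
B = [[-1, 0], [0, 1]]
AB = matmul(A, B)
BA = matmul(B, A)

# Two hypothetical diagonal matrices; their product is diagonal in either order.
D1 = [[2, 0], [0, 3]]
D2 = [[5, 0], [0, 7]]
D1D2 = matmul(D1, D2)
D2D1 = matmul(D2, D1)
```

The function also accepts rectangular factors, provided the number of columns of the first equals the number of rows of the second.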
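A particular diagonal matrix

$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$

is called the identity matrix. We note that, if A is any matrix, then

$AI = IA = A.$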


We also observe that the product of two matrices may vanish when 
neither of the matrices is a zero matrix. 


RO 000 000 
Thus, if 4=|0 0 O|andB=]ļ|O0 0 0|, then AB =]|0 0 O}. 
000 1 0 0 0 0 0 




However, the determinant |AB| of the product of two square matrices
is precisely equal to the product of the determinants |A| and |B| of the
matrices A and B. This follows at once from the observation that the
law of formation of the element in the ith row and kth column of
the product of two determinants is identical with the corresponding rule
for the product of two matrices. We call an n × n matrix whose determinant
is zero a singular matrix.

Finally, we define the multiplication of the matrix $A = (a_{ij})$ by the
number k, written kA, as the matrix each of whose elements is multiplied
by k. Thus $kA = (k a_{ij})$.

As an exercise the reader will verify the following theorems, which 
follow directly from the definitions given previously. 


(I) A + B = B + A.
(II) (A + B) + C = A + (B + C).
(III) (A + B)C = AC + BC.
(IV) C(A + B) = CA + CB.


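The notation just developed permits us to write the system of equations
8.1 in the form of a vector equation

(8.4)  $\mathbf{x}' = A\mathbf{x},$

where $A = (a_{ij})$ and where we agree to interpret $\mathbf{x}$ either as a column
matrix

$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$ or a square matrix $\begin{pmatrix} x_1 & 0 & \cdots & 0 \\ x_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ x_n & 0 & \cdots & 0 \end{pmatrix}.$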

The inverse transformation 8.2 can be written

(8.5)  $\mathbf{x} = A^{-1}\mathbf{x}',$

where

(8.6)  $A^{-1} = \begin{pmatrix} A_{11}/|A| & A_{21}/|A| & \cdots & A_{n1}/|A| \\ A_{12}/|A| & A_{22}/|A| & \cdots & A_{n2}/|A| \\ \vdots & \vdots & & \vdots \\ A_{1n}/|A| & A_{2n}/|A| & \cdots & A_{nn}/|A| \end{pmatrix},$

and the $A_{ij}$'s are the cofactors of the elements $a_{ij}$ in the determinant |A|.

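The matrix $A^{-1}$ is called the inverse of the matrix A, and it is defined
for any nonsingular matrix A. From definition 8.6 it follows that the
matrices A and $A^{-1}$ are related by the formulas

$AA^{-1} = I, \quad A^{-1}A = I,$

where I is the identity matrix. For, $AA^{-1} = (a_{ij} A_{kj}/|A|)$ and $a_{ij} A_{kj} = \delta_{ik}|A|$.
The identity matrix I corresponds to an identity transformation
$x_i' = x_i$; this transformation when written in the matrix form (8.4)
appears as $\mathbf{x}' = I\mathbf{x}$, or

$\mathbf{x}' = \mathbf{x}.$

We call the matrix

$A' = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix},$

obtained by interchanging the rows and columns in the matrix

$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},$

the transpose of A.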
Using the definition of transpose and the laws of addition and multi- 
plication of matrices it is easy to show that 


(V) (A + B)' = A' + B'.
(VI) (kA)' = kA'.
(VII) (AB)' = B'A'. (Note order.)


If A is nonsingular, then the matric equations

$AX = I$ and $XA = I$

have unique solutions $X = A^{-1}$, as can be immediately verified by multiplying
the first on the left and the second on the right by $A^{-1}$ and noting that

$AA^{-1} = A^{-1}A = I.$

If we take $A^{-1}A = AA^{-1}$ and form the transpose, we get

$A'(A^{-1})' = (A^{-1})'A'.$

Multiplying by $(A')^{-1}$ on the left, we obtain

$(A')^{-1}A'(A^{-1})' = (A')^{-1}(A^{-1})'A',$
$(A^{-1})' = (A')^{-1}(AA^{-1})' = (A')^{-1}.$

Thus

$(A^{-1})' = (A')^{-1}.$

We can also readily show that

$(AB)^{-1} = B^{-1}A^{-1}.$ (Note order.)
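The rule for the inverse of a product is left to the reader; one possible verification, assuming only that A and B are nonsingular matrices of the same order and that matrix multiplication is associative, runs as follows.

```latex
\begin{aligned}
(AB)(B^{-1}A^{-1}) &= A\,(BB^{-1})\,A^{-1} = A\,I\,A^{-1} = AA^{-1} = I,\\
(B^{-1}A^{-1})(AB) &= B^{-1}(A^{-1}A)\,B = B^{-1}I\,B = B^{-1}B = I.
\end{aligned}
```

Since the equations $(AB)X = I$ and $X(AB) = I$ have the unique solution $X = (AB)^{-1}$, it follows that $(AB)^{-1} = B^{-1}A^{-1}$.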
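If we have two successive linear transformations

$x_i' = a_{ij} x_j$ and $x_i'' = b_{ij} x_j', \quad (i, j = 1, \ldots, n),$

the direct transformation from the variables $x_i$ to the variables $x_i''$ is

$x_i'' = b_{ij} a_{jk} x_k, \quad (i, j, k = 1, \ldots, n);$

this is called the product transformation. Writing these transformations
in matrix notation yields

$\mathbf{x}' = A\mathbf{x}$ and $\mathbf{x}'' = B\mathbf{x}',$

so that

$\mathbf{x}'' = BA\mathbf{x}.$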


Since the product BA, in general, is not equal to AB, we see that the order 
in which the transformations are performed is not immaterial. 

It should be observed that the matrix A in the equation x’ = Ax can 
be interpreted as an operator which converts a vector x into another 
vector x’. Because of the properties 


A(kx) = kAx 
and 
A(x + y) = Ax + Ay, 


where k is any scalar, A is frequently called a linear vector operator or 
linear vector function. It can be viewed as an apparatus for the manu- 
facture of a new vector from a given vector. We shall expound these 
points in greater detail by considering a number of examples of the uses 
of matrices in several situations familiar from analytic geometry and 
elementary vector analysis. 




9. Linear Transformations in Euclidean 3-Space 


Let us refer our Euclidean 3-space ($E_3$) to a system of coordinates with
base vectors $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \mathbf{a}^{(3)}$, linearly independent but not necessarily
orthogonal. Then any vector $\mathbf{x}$ can be represented in the form

(9.1)  $\mathbf{x} = x_j \mathbf{a}^{(j)}, \quad (j = 1, 2, 3),$

where the $x_j$ are appropriate real measure numbers. If we introduce
a real linear transformation

(9.2)  $x_i' = a_{ij} x_j$ with $|a_{ij}| \ne 0, \quad (i, j = 1, 2, 3),$

or

(9.3)  $\mathbf{x}' = A\mathbf{x},$


we can interpret the resulting vector x’ as a deformed vector produced 
by the deformation of space which is characterized by the operator A. 
In general, the length of the vector x’ will be different from that of x, 
and its orientation relative to our fixed reference frame will differ from 
the orientation of the vector x. 

Obviously there are infinitely many reference frames that may be
imbedded in our space, and in each frame the vector $\mathbf{x}$ is characterized
uniquely by a triplet of numbers. Let us inquire: What is the form of
the transformation giving the same deformation of space as that characterized
by the matrix A, when the vector $\mathbf{x}$ is referred to a new frame of
reference in which the base vectors $\boldsymbol{\alpha}^{(1)}, \boldsymbol{\alpha}^{(2)}, \boldsymbol{\alpha}^{(3)}$ are related to the old
base vectors $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \mathbf{a}^{(3)}$ by the formulas

(9.4)  $\boldsymbol{\alpha}^{(i)} = b_{ij} \mathbf{a}^{(j)}$?

We shall suppose that the matrix $(b_{ij}) = B$ is nonsingular and denote
the components of $\mathbf{x}$ relative to the new system by $(\xi_1, \xi_2, \xi_3)$, so that

(9.5)  $\mathbf{x} = \xi_i \boldsymbol{\alpha}^{(i)}.$

If we insert in (9.5) the expressions 9.4 for the base vectors $\boldsymbol{\alpha}^{(i)}$ in
terms of $\mathbf{a}^{(j)}$, we obtain

(9.6)  $\mathbf{x} = \xi_i b_{ij} \mathbf{a}^{(j)}.$

A comparison of this equation with (9.1) yields the connection between
the components $\xi_i$ and $x_j$, namely,

(9.7)  $x_j = b_{ij} \xi_i.$


We note that the matrix B in the transformation 9.4 of base vectors
$\mathbf{a}^{(i)}$ differs from the matrix B' in the transformation 9.7 of components
of the vector $\mathbf{x}$ in that the rows and columns in these matrices are
interchanged. Thus the matrix B' is the transpose of the matrix B. We write
(9.7) in the form

(9.8)  $\mathbf{x} = B'\boldsymbol{\xi}.$

The solution of (9.8) for $\boldsymbol{\xi}$ is given by

(9.9)  $\boldsymbol{\xi} = (B')^{-1}\mathbf{x}.$

To simplify writing we denote $(B')^{-1}$ by C, so that (9.9) becomes

(9.10)  $\boldsymbol{\xi} = C\mathbf{x},$

where

(9.11)  $C = (B')^{-1}.$


Formula 9.10 permits us to calculate the components of the vector $\mathbf{x}$
when it is referred to a new system of base vectors $\boldsymbol{\alpha}^{(i)}$, determined by
(9.4). Consequently the components $\xi_1', \xi_2', \xi_3'$ of $\mathbf{x}'$, relative to the
reference frame with base vectors $\boldsymbol{\alpha}^{(i)}$, are given by

(9.12)  $\boldsymbol{\xi}' = C\mathbf{x}',$

and the question of the expression (in the new frame) for the deformation
of space characterized by (9.3) amounts to finding the relation connecting
the components $\xi_1, \xi_2, \xi_3$ with $\xi_1', \xi_2', \xi_3'$. The substitution from (9.3)
in (9.12) gives

$\boldsymbol{\xi}' = CA\mathbf{x},$

and, since by (9.10)

$\mathbf{x} = C^{-1}\boldsymbol{\xi},$

we get the desired relation

(9.13)  $\boldsymbol{\xi}' = CAC^{-1}\boldsymbol{\xi}.$


The transformation determined by the matrix $S = CAC^{-1}$ is called similar
to the transformation produced by A because formulas 9.13 and 9.3
characterize the same deformation of space relative to two different
reference frames.

If we recall the definition (9.11), we can write (9.13) in the form

(9.14)  $\boldsymbol{\xi}' = (B')^{-1}AB'\boldsymbol{\xi},$


which brings into explicit evidence the matrices A and B characterizing, 
respectively, the deformation of space and the transformation of base 
vectors. We note that the determinants of all similar transformations 
are equal. An important special case of the transformation 9.2, corre- 
sponding to the rotation of the vector x to a new position, is discussed 
in the next section. 
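The chain of relations (9.10) through (9.13) can be followed numerically. The sketch below uses exact rational arithmetic; the helper names and the particular matrices A and B are hypothetical choices for illustration, and the inverse is formed from cofactors in the manner of (8.6).

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product: the element (i, k) is a_ij b_jk summed on j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def matvec(a, x):
    """Components a_ij x_j of the transformed vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def minor(m, i, j):
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by Laplace development along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse(m):
    """Inverse from cofactors: element (i, j) is the cofactor of m[j][i] over det(m)."""
    d = det(m)
    n = len(m)
    return [[Fraction((-1) ** (i + j) * det(minor(m, j, i)), d)
             for j in range(n)] for i in range(n)]


# Hypothetical deformation A and nonsingular change of base vectors B.
A = [[2, 1, 0], [0, 1, 0], [0, 0, 3]]
B = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]

C = inverse(transpose(B))                # C = (B')^{-1}, as in (9.11)
S = matmul(matmul(C, A), inverse(C))     # S = C A C^{-1}

x = [1, 2, 3]
xi = matvec(C, x)                        # components of x in the new frame
xi_prime_direct = matvec(C, matvec(A, x))   # xi' = C x' with x' = A x
xi_prime_similar = matvec(S, xi)            # xi' = C A C^{-1} xi
```

Both routes give the same components of the deformed vector, and the determinants of A and S agree, in line with the remark that the determinants of all similar transformations are equal.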




10. Orthogonal Transformation in $E_3$


Let us suppose that the base vectors $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \mathbf{a}^{(3)}$ in Sec. 9 are orthogonal
unit vectors, so that the measure numbers $x_i$ in (9.1) are the physical
components of $\mathbf{x}$. Then the square of the length of the vector $\mathbf{x}$ is given
by the formula

$|\mathbf{x}|^2 = x_i x_i, \quad (i = 1, 2, 3).$

Let us inquire about restrictions that must be imposed on the matrix
A in (9.3) if the length of $\mathbf{x}$ is to be unchanged by the transformation 9.2.
This restriction demands that

(10.1)  $x_i' x_i' = x_i x_i.$

Substituting in (10.1) from (9.2), we obtain

$(a_{ij} x_j)(a_{ik} x_k) = x_i x_i, \quad (i, j, k = 1, 2, 3),$

or

(10.2)  $a_{ij} a_{ik} x_j x_k = \delta_{jk} x_j x_k,$

since

$\delta_{jk} x_j x_k = x_k x_k = x_i x_i.$

Equating the coefficients of like products in (10.2), we obtain six equations

$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1,$
$a_{12}^2 + a_{22}^2 + a_{32}^2 = 1,$
$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1,$
$a_{12} a_{13} + a_{22} a_{23} + a_{32} a_{33} = 0,$
$a_{13} a_{11} + a_{23} a_{21} + a_{33} a_{31} = 0,$
$a_{11} a_{12} + a_{21} a_{22} + a_{31} a_{32} = 0,$

or

(10.3)  $a_{ij} a_{ik} = \delta_{jk}.$

These equations are consequences of the hypothesis that the length of $\mathbf{x}$
remains invariant. The determinant of the matrix in (10.3) has the value

(10.4)  $|a_{ij} a_{ik}| = 1.$


Since the value of the determinant $|a_{ij}|$ is unchanged when its rows
and columns are interchanged, we see from the rule for multiplication of
determinants (Sec. 7) that

$|a_{ij} a_{ik}| = |a_{ij}| \cdot |a_{ij}| = |A|^2.$

Thus (10.4) yields the result that the square of the determinant $|a_{ij}|$ in
(9.2) has the value 1 whenever the length of the vector is unchanged by
the transformation. We conclude that $|A| = \pm 1$. The case when
$|A| = +1$ corresponds to the transformation of rotation of space relative
to fixed axes. The circumstance when $|A| = -1$ corresponds to the
transformation of reflection (say, $x_1' = -x_1$, $x_2' = -x_2$, $x_3' = -x_3$) or a
reflection followed by a rotation.
A linear transformation

(10.5)  $x_i' = a_{ij} x_j,$

in which $a_{ij} a_{ik} = \delta_{jk}$, is called an orthogonal transformation. It is called
the transformation of rotation when $|a_{ij}| = +1$. If we denote by A'
the transpose of A in (10.5), we can write the orthogonality conditions
(10.3) in the form

$A'A = I.$

Multiplying this equation on the right by $A^{-1}$, we get

(10.6)  $A' = A^{-1},$

so that in an orthogonal transformation the inverse matrix $A^{-1}$ is equal to
the transpose A' of A when the base vectors are orthonormal.
It follows that, if we write equations 10.5 in the form

$\mathbf{x}' = A\mathbf{x},$

then

$\mathbf{x} = A^{-1}\mathbf{x}',$

and by virtue of (10.6)

$\mathbf{x} = A'\mathbf{x}',$

or

(10.7)  $x_i = a_{ji} x_j'.$
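As an illustration of the orthogonality conditions, one may take a rotation through an angle about the third coordinate axis. The Python sketch below is an example chosen for this purpose, not a construction from the text; the angle `theta`, the helper names, and the sample vector are assumptions, and the checks hold only up to floating-point rounding.

```python
import math


def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def matvec(a, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def length(x):
    """Length of x from the formula |x|^2 = x_i x_i."""
    return math.sqrt(sum(xi * xi for xi in x))


theta = 0.7  # arbitrary illustrative angle, in radians
c, s = math.cos(theta), math.sin(theta)

# Matrix of a rotation about the third axis.
A = [[c, -s, 0.0],
     [s, c, 0.0],
     [0.0, 0.0, 1.0]]

x = [1.0, 2.0, 3.0]
x_prime = matvec(A, x)                    # x' = A x, as in (10.5)
x_back = matvec(transpose(A), x_prime)    # x = A' x', as in (10.7)
AtA = matmul(transpose(A), A)             # should reproduce the identity matrix
```

The transpose is used in place of the inverse to recover the original vector, which is precisely the content of (10.6).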


11. Linear Transformations in n-Dimensional Euclidean Spaces 


Our discussion of linear transformations in Euclidean 3-space can be
immediately extended to n-dimensional manifolds $E_n$ referred to a
coordinate system such that the length of the vector $\mathbf{x}$ is determined from
formula 5.6.

We introduce n orthonormal vectors,

$\mathbf{a}^{(1)}: (1, 0, 0, \ldots, 0),$
$\mathbf{a}^{(2)}: (0, 1, 0, \ldots, 0),$
$\cdots$
$\mathbf{a}^{(n)}: (0, 0, 0, \ldots, 1),$

and represent any vector $\mathbf{x}$: $(x_1, x_2, \ldots, x_n)$ in the form (cf. equation 9.1)

(11.1)  $\mathbf{x} = x_i \mathbf{a}^{(i)}, \quad (i = 1, 2, \ldots, n).$

A linear transformation of components, corresponding to equation
9.2, is

(11.2)  $x_i' = a_{ij} x_j, \quad (i, j = 1, 2, \ldots, n).$

We can write it in matrix notation as

(11.3)  $\mathbf{x}' = A\mathbf{x},$

where $A = (a_{ij})$.
We suppose that $|A| \ne 0$, and denote the solution of (11.3) by

$\mathbf{x} = A^{-1}\mathbf{x}',$

where

$A^{-1} = \left( \dfrac{A_{ji}}{|A|} \right).$

The $A_{ij}$'s denote the cofactors of the elements $a_{ij}$ in |A|.

Just as was done in the three-dimensional case, we can show that the
product of transformations $\mathbf{x}' = A\mathbf{x}$ and $\mathbf{x}'' = B\mathbf{x}'$ is $\mathbf{x}'' = BA\mathbf{x}$. We
can still use the suggestive language of geometry and speak of the set of
equations 11.3 as representing the deformation of space $E_n$, and consider
that the transformation of the form

(11.4)  $\boldsymbol{\xi}' = CAC^{-1}\boldsymbol{\xi}$

represents the same deformation of space as that characterized by the
matrix A in (11.3). The matrices A and $CAC^{-1}$ are still termed similar.
By analogy with the three-dimensional case, a real linear transformation
that leaves the length of every real vector $\mathbf{x}$: $(x_1, \ldots, x_n)$ invariant is
called orthogonal. From computations of Sec. 10 it is obvious that the
coefficients $a_{ij}$ in an orthogonal transformation (11.2) satisfy the relations

(11.5)  $a_{ij} a_{ik} = \delta_{jk},$

and that the matrix $A = (a_{ij})$ of an orthogonal transformation is related
to its inverse by the formula $A' = A^{-1}$. The condition (11.5) is both
necessary and sufficient for a transformation to be orthogonal. Since
the transpose of the matrix of an orthogonal transformation is equal to
its inverse, we deduce that $a_{ji} a_{ki} = \delta_{jk}$.

Any matrix satisfying the orthogonality conditions (11.5) is called
orthogonal. The square of the determinant of such a matrix has the
value 1.




As in the three-dimensional case we introduce a matrix $B = (b_{ij})$
defining a transformation of the base vectors $\mathbf{a}^{(i)}$ into a new set of base
vectors $\boldsymbol{\alpha}^{(i)}$ in accordance with the formula

(11.6)  $\boldsymbol{\alpha}^{(i)} = b_{ij} \mathbf{a}^{(j)}, \quad (i, j = 1, \ldots, n);$

then $C = (B')^{-1}$.

If the vectors $\mathbf{a}^{(i)}$ are orthonormal and the matrix B orthogonal, the
new set of vectors $\boldsymbol{\alpha}^{(i)}$ will obviously be orthonormal. Whenever $|b_{ij}| = 1$,
we shall speak of (11.6) as representing a rotation of base vectors in $E_n$.
We now raise the question: Is it possible to find a matrix C such that
the matrix $CAC^{-1}$ has the diagonal form

$\Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$?

This means that relative to a suitable reference frame the deformation
of space, characterized by (11.2), assumes the form

(11.7)  $\xi_1' = \lambda_1 \xi_1, \quad \xi_2' = \lambda_2 \xi_2, \quad \ldots, \quad \xi_n' = \lambda_n \xi_n,$

the $\xi_i'$'s being the components of $\mathbf{x}'$ and the $\xi_i$'s of $\mathbf{x}$ in the new coordinate
system.

In the language of transformations in $E_3$, equations 11.7 state that for
a suitably chosen reference frame the linear deformation of space is
equivalent to simple extensions or contractions along the coordinate axes.
Clearly the possibility of such reduction depends on the nature of
coefficients $a_{ij}$ in (11.2).

A detailed discussion of the problem of reduction of matrices to various 
canonical forms is involved. In the following sections we treat only 
those cases that occur most frequently in applications, referring the 
reader for an exhaustive treatment to standard treatises on higher algebra. 


12. Reduction of Matrices to the Diagonal Form 


We return to the problem posed in Sec. 11, concerning the possibility
of finding a nonsingular matrix C such that an arbitrary matrix A can
be reduced to the diagonal form $\Lambda$ by means of the similitude
transformation $CAC^{-1}$. From the point of view of linear transformation of
space, this problem is equivalent to determining the base system $\boldsymbol{\alpha}^{(i)}$,
$(i = 1, \ldots, n)$, relative to which the transformation

$x_i' = a_{ij} x_j$

assumes the form [see (11.7)]

$\xi_1' = \lambda_1 \xi_1, \quad \xi_2' = \lambda_2 \xi_2, \quad \ldots, \quad \xi_n' = \lambda_n \xi_n.$

We write $C^{-1} = S$, and seek a solution of the matric equation

(12.1)  $S^{-1}AS = \Lambda,$

or

(12.2)  $AS = S\Lambda,$

where $A = (a_{ij})$ and

$\Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$

The matric equation 12.2 is equivalent to the system of linear equations

(12.3)  $a_{ij} s_{jk} = s_{ik} \lambda_k$, (no sum on k), $(i, j = 1, 2, \ldots, n),$

where

$S = \begin{pmatrix} s_{11} & \cdots & s_{1k} & \cdots & s_{1n} \\ s_{21} & \cdots & s_{2k} & \cdots & s_{2n} \\ \vdots & & \vdots & & \vdots \\ s_{n1} & \cdots & s_{nk} & \cdots & s_{nn} \end{pmatrix}.$

If in (12.3) we set i = 1, 2, ..., n, and fix k, we obtain a system of n
equations containing the elements $s_{1k}, s_{2k}, \ldots, s_{nk}$ appearing in the kth
column of S. The elements $(s_{1k}, s_{2k}, \ldots, s_{nk})$ can be viewed as components
of the vector $\mathbf{s}^{(k)}$, so that the determination of the matrix S is
equivalent to finding a set of n vectors $\mathbf{s}^{(k)}$, $(k = 1, \ldots, n)$, whose components
satisfy equations 12.3. Accordingly we write equation 12.3 in
the form

(12.4)  $A\mathbf{s}^{(k)} = \mathbf{s}^{(k)} \lambda_k$, (no sum on k),

and note that (12.3) is equivalent to

(12.5)  $(a_{ij} - \delta_{ij}\lambda_k) s_{jk} = 0$, (k not summed).




If this system of linear homogeneous equations is to have a nontrivial 
solution for the s,,, then 7, must be a root of the determinantal equation 


Cy = 6;;A| = WL 
which, when written out in full, is 
ay,—A a12 SSe Ain 
(12.6) a oe | 
Ant an2 ann — A 


This $n$th order algebraic equation in $\lambda$ has $n$ roots, which are known as
the characteristic values* of the matrix $A$. If these $n$ roots are distinct,
we can readily show that the system of equations 12.4 yields a set of $n$
linearly independent vectors $\mathbf{s}^{(k)}$, and hence a nonsingular matrix $S$, as
required by (12.1), exists. If the roots are not distinct, it may not be
possible to determine the desired matrix $S$.

We consider the case when the roots are distinct, and denote them
by $\lambda_1, \lambda_2, \ldots, \lambda_n$. If we set $\lambda_1$ for $\lambda_k$ in (12.5) we obtain a system of $n$
homogeneous equations. This system will have a nontrivial solution
$s_{11}, s_{21}, \ldots, s_{n1}$, which gives the first column of $S$. Setting $\lambda_k = \lambda_2$ in (12.5)
we get the system yielding a solution $s_{12}, s_{22}, \ldots, s_{n2}$. This gives the
second column of $S$. Proceeding in this fashion we can determine the
remaining columns and hence the matrix $S$, which satisfies the equation 12.2.
To show that the transformation 12.1 is possible, we must demonstrate that
the vectors $\mathbf{s}^{(k)}$ so calculated are linearly independent, so that $S$
possesses an inverse $S^{-1}$. We shall prove this by supposing that the matrix
$S$ is singular and reaching a contradiction.

If $|S| = 0$, the vectors $\mathbf{s}^{(k)}$ appearing in the columns of $S$ are linearly
dependent, and hence there exists a set of constants $c_i$, not all zero, such
that

$c_1\mathbf{s}^{(1)} + c_2\mathbf{s}^{(2)} + \cdots + c_n\mathbf{s}^{(n)} = 0.$

In this expression some $c$'s may be zero. We may suppose, without
loss of generality, that the first $r$ $c$'s do not vanish, so that we have the
relation

(12.7)  $c_1\mathbf{s}^{(1)} + c_2\mathbf{s}^{(2)} + \cdots + c_r\mathbf{s}^{(r)} = 0$,  $(r \le n)$,

in which none of the $c$'s (or $\mathbf{s}^{(k)}$'s) vanishes.


* They are also called eigenvalues, and the corresponding vectors $\mathbf{s}^{(k)}$ are termed
characteristic vectors or eigenvectors.


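From (12.4) we deduce the relations

$A\mathbf{s}^{(k)} = \mathbf{s}^{(k)}\lambda_k, \quad A(A\mathbf{s}^{(k)}) = \mathbf{s}^{(k)}\lambda_k^2, \quad A[A(A\mathbf{s}^{(k)})] = \mathbf{s}^{(k)}\lambda_k^3, \quad \ldots, \quad A^{r-1}\mathbf{s}^{(k)} = \mathbf{s}^{(k)}\lambda_k^{r-1}.$

If we multiply (12.7) by $A$ successively $r - 1$ times and take account
of the chain of relations just written, we obtain a system of equations

$c_1\mathbf{s}^{(1)} + c_2\mathbf{s}^{(2)} + \cdots + c_r\mathbf{s}^{(r)} = 0,$

$c_1\mathbf{s}^{(1)}\lambda_1 + c_2\mathbf{s}^{(2)}\lambda_2 + \cdots + c_r\mathbf{s}^{(r)}\lambda_r = 0,$

$\cdots$

$c_1\mathbf{s}^{(1)}\lambda_1^{r-1} + c_2\mathbf{s}^{(2)}\lambda_2^{r-1} + \cdots + c_r\mathbf{s}^{(r)}\lambda_r^{r-1} = 0.$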


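Since none of the $c$'s or $\mathbf{s}^{(k)}$'s vanishes, this system can be satisfied
only if

$$\Delta = \begin{vmatrix} 1 & 1 & \cdots & 1 \\ \lambda_1 & \lambda_2 & \cdots & \lambda_r \\ \vdots & \vdots & & \vdots \\ \lambda_1^{r-1} & \lambda_2^{r-1} & \cdots & \lambda_r^{r-1} \end{vmatrix} = 0.$$

The determinant $\Delta$, however, is a Vandermondian,⁵ and its value is
known to be

$$\Delta = (\lambda_2 - \lambda_1)(\lambda_3 - \lambda_1)\cdots(\lambda_r - \lambda_1)(\lambda_3 - \lambda_2)\cdots(\lambda_r - \lambda_2)\cdots(\lambda_r - \lambda_{r-1}) = \prod_{i > j}(\lambda_i - \lambda_j).$$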
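The product formula for the Vandermondian can be checked numerically. The following Python sketch expands the determinant directly and compares it with $\prod_{i>j}(\lambda_i - \lambda_j)$ for a sample set of distinct integers. The helper names `det`, `vandermonde`, and `product_formula`, and the sample values, are illustrative choices and do not come from the text.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def vandermonde(lams):
    """Rows are the powers 0, 1, ..., r-1 of the given numbers."""
    return [[lam ** p for lam in lams] for p in range(len(lams))]


def product_formula(lams):
    """Product of (lam_i - lam_j) over all pairs with i > j."""
    prod = 1
    for i in range(len(lams)):
        for j in range(i):
            prod *= lams[i] - lams[j]
    return prod


sample = [2, -1, 3, 5]  # distinct values, chosen only for illustration
print(det(vandermonde(sample)), product_formula(sample))
```

If two of the sample values coincide, both expressions vanish, which is the case the argument excludes.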


This is never zero if the $\lambda$'s are distinct. Thus the assumption that the
matrix $S$ is singular is incorrect, and hence the matrix $A$ can always be
reduced to the diagonal form whenever its characteristic values are
distinct.
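As a small illustration of the construction in this section, the Python sketch below diagonalizes a $2 \times 2$ matrix with distinct characteristic values using exact rational arithmetic. The sample matrix and the helper names are assumptions made for the example. The shortcut used for each column of $S$, the vector $(a_{12}, \lambda - a_{11})$, satisfies (12.5) for a $2 \times 2$ matrix provided $a_{12} \neq 0$.

```python
from fractions import Fraction
from math import isqrt


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[4, 1], [2, 3]]  # illustrative matrix, not taken from the text

# Characteristic equation: lam^2 - (trace) lam + det = 0.
trace = A[0][0] + A[1][1]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = trace ** 2 - 4 * det_A
root = isqrt(disc)  # the sample is chosen so that disc is a perfect square
lams = [Fraction(trace - root, 2), Fraction(trace + root, 2)]

# Column k of S solves (A - lam_k I) s = 0; valid here because a_12 != 0.
cols = [(A[0][1], lam - A[0][0]) for lam in lams]
S = [[cols[0][0], cols[1][0]],
     [cols[0][1], cols[1][1]]]

Lam = matmul(matmul(inverse_2x2(S), A), S)  # S^{-1} A S
print(lams, Lam)
```

The columns of $S$ are determined only up to a nonzero factor, so any rescaling of a column yields the same diagonal matrix.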

If the roots of the equation $|A - \lambda I| = 0$ are not distinct, the reduction
of $A$ to the diagonal form by the transformation $S^{-1}AS$ may be impossible.
In this event there are other canonical representations which are discussed
in books on higher algebra.⁶ In several special cases, however, the
reduction of the matrix $A$ to the diagonal form, even when the
characteristic equation $|A - \lambda I| = 0$ has multiple roots, can be achieved. We
turn to the consideration of these cases in the following sections.

⁵ See L. J. Paige and J. D. Swift, Elements of Linear Algebra, Ginn and Co., Boston
(1961).
⁶ See F. D. Murnaghan, Applied Mathematics, John Wiley and Sons, New York
(1948); G. Birkhoff and S. MacLane, A Survey of Modern Algebra, The Macmillan Co.,
New York (1941).


13. Real Symmetric Matrices and Quadratic Forms 
Let us assume that the matrix $A = (a_{ij})$ in a linear transformation

(13.1)  $\xi_i = a_{ij}x_j$,  $(i, j = 1, 2, \ldots, n)$,

is real and symmetric, so that $a_{ij} = a_{ji}$ (or $A' = A$) for all values of $i$
and $j$. We will show that the matrix $A$ can be reduced to the diagonal
form by the transformation $S^{-1}AS$. Moreover, $S$ can be an orthogonal
matrix.

Linear transformations with real symmetric matrices occur commonly 
in the study of deformations taking place in elastic media. Real symmetric 
matrices also enter prominently in the study of real quadratic forms 


(13.2)  $Q(x_1, x_2, \ldots, x_n) = a_{ij}x_ix_j$,  $(i, j = 1, \ldots, n)$,

which arise in many problems in dynamics and geometry. We can assume
without loss of generality that the coefficients $a_{ij}$ in (13.2) are symmetric,
since (13.2) can always be written

$Q(x_1, x_2, \ldots, x_n) = \dfrac{a_{ij} + a_{ji}}{2}\,x_ix_j,$

in which the coefficients are obviously symmetric. In dealing with
quadratic forms we shall always suppose that they have been symmetrized.


It will follow from the developments in this section that the problems
of reduction of the set of linear forms (13.1) to the form

$\xi_1 = \lambda_1 x_1, \quad \xi_2 = \lambda_2 x_2, \quad \ldots, \quad \xi_n = \lambda_n x_n,$

and of the quadratic form (13.2) to the form

(13.3)  $Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2$

are mathematically identical.


We note first several properties of quadratic forms (13.2). If we
introduce a linear transformation

(13.4)  $x_i = s_{ij}\xi_j$,  or  $x = S\xi$,

the form $Q$ in (13.2) becomes

$Q = a_{ij}(s_{ik}\xi_k)(s_{jl}\xi_l) = a_{ij}s_{ik}s_{jl}\xi_k\xi_l.$

We denote the coefficients of $\xi_k\xi_l$ by $c_{kl}$, so that

$Q = c_{kl}\xi_k\xi_l,$

where

(13.5)  $c_{kl} = a_{ij}s_{ik}s_{jl}.$
Since $a_{ij} = a_{ji}$, and $i$ and $j$ in (13.5) are the summation indices, an
interchange of $k$ and $l$ does not alter the value of (13.5). Thus $c_{kl} = c_{lk}$, and
hence the matrix $C = (c_{ij})$ is symmetric. We thus have the result that
the symmetry of quadratic form (13.2) is not destroyed by subjecting the
variables $x_i$ to a linear transformation.

Let us write (13.5) in the form

$c_{kl} = s_{ik}(a_{ij}s_{jl}),$

and observe that $a_{ij}s_{jl}$ is an element in the $i$th row and $l$th column of
the matrix

$AS \equiv B.$

Thus

(13.6)  $c_{kl} = s_{ik}b_{il}$

can be regarded as the element in the $k$th row and the $l$th column of the
matrix $S'B$, and

(13.7)  $C = S'AS.$
We have established a

THEOREM. If the variables $x_i$ in the quadratic form $Q = a_{ij}x_ix_j$, with
a matrix $A$, are subjected to a linear transformation $x_i = s_{ij}\xi_j$, with a
matrix $S$, the resulting quadratic form has the matrix $S'AS$.

We note, as a corollary of this theorem, that the determinant of the
resulting quadratic form has the value $|A|\,|S|^2$.
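The theorem and its corollary can be verified on a small example. The Python sketch below uses an illustrative symmetric matrix $A$ and an arbitrary nonsingular $S$ (both assumptions made for the example), forms $C = S'AS$, and checks that the value of $Q$ is the same whether it is computed from $x$ with $A$ or from $\xi$ with $C$.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def quad_form(M, v):
    """Value of m_ij v_i v_j (summation over i and j)."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[2, 1], [1, 3]]   # symmetric matrix of the form Q (illustrative)
S = [[1, 2], [3, 4]]   # nonsingular, not orthogonal (illustrative)

C = matmul(matmul(transpose(S), A), S)   # matrix of the transformed form

xi = [5, -7]                                                   # new variables
x = [sum(S[i][j] * xi[j] for j in range(2)) for i in range(2)]  # x = S xi

print(C, quad_form(A, x), quad_form(C, xi), det2(C), det2(A) * det2(S) ** 2)
```

Integer entries are used so that every comparison is exact.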

If the transformation 13.4 is orthogonal, then $S' = S^{-1}$ and we can
write (13.7) as

$C = S^{-1}AS.$

It follows from this result that the determination of an orthogonal
transformation which reduces the form 13.2 to the sum of the squares 13.3
reduces to the solution of the matrix equation

(13.8)  $S^{-1}AS = \Lambda.$


This is precisely the problem we considered in Sec. 12. It follows from
the discussion of that section that the system of homogeneous equations

(13.9)  $a_{ij}s_{jk} = s_{ik}\lambda_k$,  (no sum on $k$),

obtained from

$AS = S\Lambda$

(see equations 12.3) will have a nontrivial solution for the vectors
$\mathbf{s}^{(k)}$: $(s_{1k}, s_{2k}, \ldots, s_{nk})$ if, and only if, the $\lambda_k$'s in (13.9) satisfy the equation

$|a_{ij} - \delta_{ij}\lambda| = 0$,  or

(13.10)  $|A - \lambda I| = 0.$


If the matrix $A$ is arbitrary, the characteristic equation 13.10, in general,
has complex roots; and if these roots are distinct, the methods discussed
in Sec. 12 permit us to calculate a set of $n$ linearly independent vectors
$\mathbf{s}^{(k)}$ composing the matrix $S$. In the present case, however, the matrix $S$
has to be orthogonal and hence real. Now, if the roots of the
characteristic equation 13.10 are real, then it follows at once from equation
13.9 that the solutions $\mathbf{s}^{(k)}$: $(s_{1k}, s_{2k}, \ldots, s_{nk})$ can be taken to be real
since the $a_{ij}$'s are real. We prove a

THEOREM. If the matrix $A$ is real and symmetric, then the roots of the
characteristic equation $|A - \lambda I| = 0$ are all real.

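The system of equations 13.9 can be written compactly as

(13.11)  $A\mathbf{s}^{(k)} = \lambda_k\mathbf{s}^{(k)}$,  (no sum on $k$).

We can regard $A\mathbf{s}^{(k)}$ as a vector with components

$a_{i1}s_{1k} + a_{i2}s_{2k} + \cdots + a_{in}s_{nk}$,  $(i = 1, 2, \ldots, n)$.

Let $\lambda_k$ be a root of (13.10), real or complex, and $\mathbf{s}^{(k)}$ a vector, real or
complex, satisfying the system 13.11. We multiply (13.11) scalarly by
$\mathbf{s}^{(k)}$ and get

(13.12)  $\mathbf{s}^{(k)} \cdot A\mathbf{s}^{(k)} = |\mathbf{s}^{(k)}|^2\lambda_k.$

Now, the left-hand member in this product (recall definition 6.1)

$\mathbf{s}^{(k)} \cdot A\mathbf{s}^{(k)} = a_{ij}\bar{s}_{ik}s_{jk}$,  (no sum on $k$),

is real if $a_{ij} = a_{ji}$. To prove this, note that the conjugate of $a_{ij}\bar{s}_{ik}s_{jk}$ is
equal to the original expression,

$\overline{a_{ij}\bar{s}_{ik}s_{jk}} = a_{ij}s_{ik}\bar{s}_{jk} = a_{ji}\bar{s}_{jk}s_{ik} = a_{ij}\bar{s}_{ik}s_{jk},$

where the last step is an interchange of the summation indices $i$ and $j$.
Since the left-hand member of (13.12) is real, and $|\mathbf{s}^{(k)}|^2$ is real, it follows
that $\lambda_k$ is real. This completes the proof of the theorem.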
We prove next that, if $\lambda_i$ and $\lambda_j$ are two distinct roots of (13.10), then
the vectors $\mathbf{s}^{(i)}$ and $\mathbf{s}^{(j)}$, corresponding to these roots, are orthogonal.
Since $\mathbf{s}^{(i)}$ and $\mathbf{s}^{(j)}$ satisfy (13.11), we have the identities

$A\mathbf{s}^{(i)} = \mathbf{s}^{(i)}\lambda_i$,  (no sum),

$A\mathbf{s}^{(j)} = \mathbf{s}^{(j)}\lambda_j$,  (no sum),

where all the vectors involved are real. If we multiply the first of these
scalarly by $\mathbf{s}^{(j)}$ on the right and the second by $\mathbf{s}^{(i)}$ on the left and subtract,
we get

$A\mathbf{s}^{(i)} \cdot \mathbf{s}^{(j)} - \mathbf{s}^{(i)} \cdot A\mathbf{s}^{(j)} = (\lambda_i - \lambda_j)\,\mathbf{s}^{(i)} \cdot \mathbf{s}^{(j)},$

and the left-hand member vanishes since $\mathbf{s}^{(i)} \cdot A\mathbf{s}^{(j)} = A\mathbf{s}^{(i)} \cdot \mathbf{s}^{(j)}$ because
of symmetry of $A$. This establishes the orthogonality of $\mathbf{s}^{(i)}$ and $\mathbf{s}^{(j)}$,
whenever the roots $\lambda_i$ and $\lambda_j$ are unequal. Since equation 13.11 is
homogeneous, we can multiply it by a suitable constant making the length of
$\mathbf{s}^{(i)}$ equal to 1. We shall suppose that this has been done.

We recall that a set of orthogonal vectors is necessarily linearly
independent. Hence, if all roots of $|A - \lambda I| = 0$ are distinct, the vectors $\mathbf{s}^{(i)}$
will be orthonormal, and, accordingly, the matrix $S$, accomplishing the
transformation $S^{-1}AS = \Lambda$, will be orthogonal.

It remains to consider the case of reduction of real quadratic forms
13.2 to the diagonal form 13.3 when the equation

(13.10)  $|A - \lambda I| = 0$

has multiple roots. The demonstration that the reduction is possible in
this case hinges on one important property of all similar matrices, namely:
the characteristic roots of all similar matrices are equal. The proof of this
is easy. We replace $A$ in the left-hand member of (13.10) by some similar
matrix $S^{-1}AS$ and obtain the polynomial in $\lambda$,

$|S^{-1}AS - \lambda I| = |S^{-1}(A - \lambda I)S| = |S^{-1}| \cdot |A - \lambda I| \cdot |S| = |A - \lambda I|.$

It follows that the characteristic equations associated with $S^{-1}AS$ and
$A$ are identical, and hence their roots are equal.

Now let us suppose that (13.10) has multiple roots. Let $\lambda = \lambda_1$ be
some root of (13.10), and let us determine the solution of (13.11) $\mathbf{s}^{(1)}$:
$(s_{11}, s_{21}, \ldots, s_{n1})$, corresponding to $\lambda = \lambda_1$, which is such that $\mathbf{s}^{(1)} \cdot \mathbf{s}^{(1)} = 1$.
This can be done whether $\lambda_1$ is a multiple root or not. We can adjoin
to the vector $\mathbf{s}^{(1)}$ a set of $n - 1$ orthonormal vectors forming a complete
system of vectors in our $n$-dimensional manifold. These vectors can be
used as a basis for our space instead of the original set of orthonormal
base vectors $\mathbf{a}^{(1)}, \ldots, \mathbf{a}^{(n)}$, and we can pass from the reference frame
determined by the $\mathbf{a}^{(i)}$'s to the new frame by an orthogonal transformation.
Hence the matrix of the quadratic form 13.2, when referred to the new
frame, will assume the form $A_1 = S_1^{-1}AS_1$, where $S_1$ is orthogonal.


Moreover,

(13.13)  $|A_1 - \lambda I| = 0$

has the same characteristic roots as (13.10). The equation [cf. (13.11)]

(13.14)  $A_1\mathbf{s} = \lambda\mathbf{s}$

for $\lambda = \lambda_1$ has the solution $\mathbf{s}^{(1)}$: $(1, 0, 0, \ldots, 0)$, since we chose it to
be a unit vector, and $\mathbf{s}^{(1)}$ is one of the base vectors of the new reference
frame. If we insert this solution in (13.14), we get an identity

$$A_1\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \lambda_1\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},$$

from which it follows that the matrix $A_1$ has the following elements:

(13.15)  $a^{(1)}_{11} = \lambda_1, \quad a^{(1)}_{21} = a^{(1)}_{31} = \cdots = a^{(1)}_{n1} = 0.$


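The original matrix $A$ is symmetric, and, since orthogonal
transformations do not destroy the symmetry, the matrix $A_1$ is also symmetric.⁷
Thus

$A_1' = A_1,$

and we can write instead of (13.15)

$a^{(1)}_{11} = \lambda_1, \quad a^{(1)}_{12} = a^{(1)}_{21} = a^{(1)}_{13} = a^{(1)}_{31} = \cdots = a^{(1)}_{1n} = a^{(1)}_{n1} = 0,$

so that

$$A_1 = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & a^{(1)}_{22} & \cdots & a^{(1)}_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a^{(1)}_{n2} & \cdots & a^{(1)}_{nn} \end{bmatrix}.$$

Thus the quadratic form 13.2, when referred to our new frame, has the
structure

$Q = \lambda_1\xi_1^2 + a^{(1)}_{ij}\xi_i\xi_j$,  $(i, j = 2, \ldots, n)$.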


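We succeeded in separating one square and reduced the problem to
a consideration of the form $a^{(1)}_{ij}\xi_i\xi_j$ in $n - 1$ variables. We can apply
similar reasoning to the $(n - 1) \times (n - 1)$ matrix $A_1 = (a^{(1)}_{ij})$ and consider
the form $a^{(1)}_{ij}\xi_i\xi_j$, $(i, j = 2, \ldots, n)$, in the $(n - 1)$-dimensional subspace
$E_{n-1}$ of $E_n$, determined by the base vectors other than $\mathbf{s}^{(1)}$. In $E_{n-1}$ we
can calculate a unit vector $\mathbf{s}^{(2)}$ satisfying the equation

$A_1\mathbf{s} = \lambda\mathbf{s},$

corresponding to $\lambda = \lambda_2$, and construct a new base system by an
orthogonal transformation in which $\mathbf{s}^{(2)}$ is a base vector. This will yield a matrix

$$A_2 = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & a^{(2)}_{33} & \cdots & a^{(2)}_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & a^{(2)}_{n3} & \cdots & a^{(2)}_{nn} \end{bmatrix}$$

and hence $Q$ of the form

$Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + a^{(2)}_{ij}\xi_i\xi_j$,  $(i, j = 3, \ldots, n)$.

The continuation of this process will reduce the original quadratic form
13.2 to the form

$Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2.$

⁷ For $A_1' = (S_1^{-1}AS_1)' = S_1'A'(S_1^{-1})' = S_1^{-1}AS_1$, since $S^{-1} = S'$ for orthogonal
matrices.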
Since each successive reduction is performed by an orthogonal
transformation, the product of these transformations is equivalent to
a single orthogonal transformation $S$. The resulting diagonal matrix $\Lambda$,

$$S^{-1}AS = \Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix},$$

contains each root $\lambda$ as many times as its multiplicity
in $|A - \lambda I| = 0$. Since the matrix $S^{-1}AS$ is similar to $A$, the characteristic
roots $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $|\Lambda - \lambda I| = 0$ are identical with those of
$|A - \lambda I| = 0$.

The directions determined by the characteristic vectors $\mathbf{s}^{(i)}$ associated
with the matrix $A$ are called the principal directions of the matrix $A$.


Problem
If $x$: $(x_1, x_2, \ldots, x_n)$ is a unit vector and $Q = a_{ij}x_ix_j$ is a real symmetric
quadratic form with nonsingular matrix $A$, then the extreme values of $Q$ are the
characteristic values of $A$. Prove it. Hint: Maximize $Q$ subject to the
constraining condition $x_ix_i = 1$ and deduce the system of equations
$(a_{ij} - \lambda\delta_{ij})x_j = 0$, where $\lambda$ is the Lagrange multiplier.




14. Illustrations of Reduction of Quadratic Forms 


We shall interpret the results of Sec. 13 in the language of analytic 
geometry and give two examples providing concrete illustrations of 
reduction of quadratic forms to the canonical form by means of orthogonal 
transformations. 

If we suppose that the dimensionality of space $n = 3$, and set $a_{ij}x_ix_j$
equal to a constant $c$, then the equation

(14.1)  $a_{ij}x_ix_j = c$,  $(i, j = 1, 2, 3)$,

represents a quadric surface $Q$ referred to a reference frame with base
vectors $\mathbf{a}^{(i)}$. An orthogonal transformation $S^{-1}AS = \Lambda$, leading to the
quadratic form

(14.2)  $\lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \lambda_3\xi_3^2 = c,$

can be interpreted as a transformation of coordinate axes yielding a
frame with base vectors directed along the principal axes of the quadric.
Let the quadric exemplifying (14.1) be

$Q = 2x_1^2 + 2x_2^2 - 15x_3^2 + 8x_1x_2 - 12x_2x_3 - 12x_1x_3 = c.$


In order to determine the coefficients $\lambda_i$ in (14.2) for this particular
case, we symmetrize $Q$ and obtain

$$\begin{aligned} Q = {} & 2x_1^2 + 4x_1x_2 - 6x_1x_3 \\ & + 4x_2x_1 + 2x_2^2 - 6x_2x_3 \\ & - 6x_3x_1 - 6x_3x_2 - 15x_3^2, \end{aligned}$$

from which the characteristic equation $|A - \lambda I| = 0$ can be written down
at once. We have

$$|A - \lambda I| = \begin{vmatrix} 2 - \lambda & 4 & -6 \\ 4 & 2 - \lambda & -6 \\ -6 & -6 & -15 - \lambda \end{vmatrix} = 0.$$

Expanding this determinant leads to a cubic

$\lambda^3 + 11\lambda^2 - 144\lambda - 324 = 0,$

which has the roots

$\lambda_1 = -2, \quad \lambda_2 = -18, \quad \lambda_3 = 9.$

Thus, relative to a new reference frame, $Q$ assumes the form

$-2\xi_1^2 - 18\xi_2^2 + 9\xi_3^2 = c,$

representing a hyperboloid.


For the determination of the new base vectors $\mathbf{s}^{(k)}$, we have the system
of equations 13.9,

$a_{ij}s_{jk} = s_{ik}\lambda_k$,  (no sum on $k$),

or

$(a_{ij} - \delta_{ij}\lambda_k)s_{jk} = 0.$

Writing these out, we have

(14.3)
$$\begin{aligned} (2 - \lambda_k)s_{1k} + 4s_{2k} - 6s_{3k} &= 0, \\ 4s_{1k} + (2 - \lambda_k)s_{2k} - 6s_{3k} &= 0, \\ -6s_{1k} - 6s_{2k} - (15 + \lambda_k)s_{3k} &= 0. \end{aligned}$$


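Substituting $\lambda_1 = -2$ gives three equations, two of which are identical.
The linearly independent equations are

$$\begin{aligned} 4s_{11} + 4s_{21} - 6s_{31} &= 0, \\ -6s_{11} - 6s_{21} - 13s_{31} &= 0. \end{aligned}$$

Solving these yields the components of $\mathbf{s}^{(1)}$,

$s_{11} = c, \quad s_{21} = -c, \quad s_{31} = 0,$

where $c$ is arbitrary. We determine the constant $c$ so that the length of
$\mathbf{s}^{(1)}$ is unity. Thus

$s_{11}^2 + s_{21}^2 + s_{31}^2 = 1,$

and hence $c = 1/\sqrt{2}$ and our normalized components are

$s_{11} = \dfrac{1}{\sqrt{2}}, \quad s_{21} = -\dfrac{1}{\sqrt{2}}, \quad s_{31} = 0.$

These determine the first column of the matrix $S$.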
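The substitution of $\lambda_2 = -18$ in (14.3) leads to three homogeneous
equations

$$\begin{aligned} 20s_{12} + 4s_{22} - 6s_{32} &= 0, \\ 4s_{12} + 20s_{22} - 6s_{32} &= 0, \\ -6s_{12} - 6s_{22} + 3s_{32} &= 0, \end{aligned}$$

the solution of which is readily found to be

$s_{12} = \tfrac{1}{4}c, \quad s_{22} = \tfrac{1}{4}c, \quad s_{32} = c.$

The normalized solution is

$s_{12} = \dfrac{1}{3\sqrt{2}}, \quad s_{22} = \dfrac{1}{3\sqrt{2}}, \quad s_{32} = \dfrac{4}{3\sqrt{2}}.$

The elements entering in the third column of $S$ are determined from
the system 14.3 by setting $\lambda_3 = 9$. This yields the equations

$$\begin{aligned} -7s_{13} + 4s_{23} - 6s_{33} &= 0, \\ 4s_{13} - 7s_{23} - 6s_{33} &= 0, \\ -6s_{13} - 6s_{23} - 24s_{33} &= 0, \end{aligned}$$

which are satisfied by

$s_{13} = c, \quad s_{23} = c, \quad s_{33} = -\tfrac{1}{2}c.$

Normalizing to unity, we obtain $\mathbf{s}^{(3)}$ in the form

$s_{13} = \tfrac{2}{3}, \quad s_{23} = \tfrac{2}{3}, \quad s_{33} = -\tfrac{1}{3}.$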


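Accordingly, the orthogonal transformation yielding the desired canonical
form is

$$\begin{aligned} \xi_1 &= \frac{1}{\sqrt{2}}x_1 - \frac{1}{\sqrt{2}}x_2 + 0 \cdot x_3, \\ \xi_2 &= \frac{1}{3\sqrt{2}}x_1 + \frac{1}{3\sqrt{2}}x_2 + \frac{4}{3\sqrt{2}}x_3, \\ \xi_3 &= \frac{2}{3}x_1 + \frac{2}{3}x_2 - \frac{1}{3}x_3. \end{aligned}$$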
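The result of this example can be confirmed numerically. The following Python sketch assembles $S$ from the three normalized vectors found above, placed as columns, and checks that $S'S$ is the identity and that $S'AS$ is the diagonal matrix with entries $-2$, $-18$, $9$. The helper names are illustrative, and floating-point arithmetic is used, so the comparison is made within a small tolerance.

```python
from math import sqrt, isclose


def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Symmetrized matrix of Q = 2x1^2 + 2x2^2 - 15x3^2 + 8x1x2 - 12x2x3 - 12x1x3.
A = [[2, 4, -6], [4, 2, -6], [-6, -6, -15]]

r2 = sqrt(2)
# Columns are the normalized vectors s(1), s(2), s(3) of the example.
S = [[1 / r2, 1 / (3 * r2), 2 / 3],
     [-1 / r2, 1 / (3 * r2), 2 / 3],
     [0.0, 4 / (3 * r2), -1 / 3]]

identity = matmul(transpose(S), S)        # should be the unit matrix
D = matmul(matmul(transpose(S), A), S)    # should be diag(-2, -18, 9)
expected = [[-2, 0, 0], [0, -18, 0], [0, 0, 9]]

for row in D:
    print([round(v, 10) for v in row])
```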


To illustrate reduction in the event the characteristic equation has
multiple roots, we take

$Q = 3x_1^2 + 2x_2^2 + 3x_3^2 + 2x_1x_3 = c.$

Here the characteristic equation of the matrix of $Q$ is

$$-\begin{vmatrix} 3 - \lambda & 0 & 1 \\ 0 & 2 - \lambda & 0 \\ 1 & 0 & 3 - \lambda \end{vmatrix} = \lambda^3 - 8\lambda^2 + 20\lambda - 16 = 0,$$

whose roots are $\lambda_1 = \lambda_2 = 2$, $\lambda_3 = 4$. Hence the quadric surface is an
ellipsoid of revolution whose equation can be taken in the form

$2(\xi_1^2 + \xi_2^2) + 4\xi_3^2 = c.$

The equations for the determination of the new base vectors are

$$\begin{aligned} (3 - \lambda_k)s_{1k} + 0 \cdot s_{2k} + s_{3k} &= 0, \\ 0 \cdot s_{1k} + (2 - \lambda_k)s_{2k} + 0 \cdot s_{3k} &= 0, \\ s_{1k} + 0 \cdot s_{2k} + (3 - \lambda_k)s_{3k} &= 0. \end{aligned}$$

Setting $\lambda_1 = 2$ yields only one equation

$s_{11} + s_{31} = 0$


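for the determination of $\mathbf{s}^{(1)}$, so that the normalized solution can be taken
as

$s_{11} = \dfrac{1}{\sqrt{2}}, \quad s_{21} = 0, \quad s_{31} = -\dfrac{1}{\sqrt{2}}.$

The second characteristic root $\lambda_2 = 2$ gives the equation

(14.4)  $s_{12} + s_{32} = 0,$

and since $\mathbf{s}^{(2)}$ must be normal to $\mathbf{s}^{(1)}$, we have the orthogonality condition

$s_{11}s_{12} + s_{21}s_{22} + s_{31}s_{32} = 0,$

or

(14.5)  $\dfrac{1}{\sqrt{2}}s_{12} - \dfrac{1}{\sqrt{2}}s_{32} = 0.$

Equations 14.4 and 14.5 state that $s_{12} = 0$ and $s_{32} = 0$, so that the
normalized solution is $s_{12} = 0$, $s_{22} = 1$, $s_{32} = 0$.
Finally, for the determination of the third base vector we have the
system of equations

$$\begin{aligned} -s_{13} + s_{33} &= 0, \\ -2s_{23} &= 0, \\ s_{13} - s_{33} &= 0, \end{aligned}$$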

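obtained by setting $\lambda = 4$. The normalized solution of this system is
$s_{13} = 1/\sqrt{2}$, $s_{23} = 0$, $s_{33} = 1/\sqrt{2}$. Hence the matrix $S$ has the form

$$S = \begin{bmatrix} \dfrac{1}{\sqrt{2}} & 0 & \dfrac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ -\dfrac{1}{\sqrt{2}} & 0 & \dfrac{1}{\sqrt{2}} \end{bmatrix},$$

from which the equations of connection between the variables $x_i$ and $\xi_i$
can be written down at once.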
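When a characteristic root is repeated, the corresponding base vectors are not unique: any orthonormal pair spanning the plane belonging to the double root $\lambda = 2$ serves equally well, which is the algebraic counterpart of the surface being one of revolution. The Python sketch below rotates the first two base vectors of this example through an angle $\theta$ within that plane and checks that $S'AS$ is still the diagonal matrix with entries $2$, $2$, $4$. The sign chosen for $\mathbf{s}^{(1)}$, the function name `reduced_matrix`, and the sample angles are assumptions made for the illustration.

```python
from math import cos, sin, sqrt, pi, isclose

A = [[3, 0, 1], [0, 2, 0], [1, 0, 3]]   # matrix of Q = 3x1^2 + 2x2^2 + 3x3^2 + 2x1x3

e1 = [1 / sqrt(2), 0.0, -1 / sqrt(2)]   # characteristic value 2
e2 = [0.0, 1.0, 0.0]                    # characteristic value 2
e3 = [1 / sqrt(2), 0.0, 1 / sqrt(2)]    # characteristic value 4


def reduced_matrix(theta):
    """Return S'AS when the first two columns of S are e1, e2 rotated by theta."""
    u = [cos(theta) * a + sin(theta) * b for a, b in zip(e1, e2)]
    v = [-sin(theta) * a + cos(theta) * b for a, b in zip(e1, e2)]
    cols = [u, v, e3]
    # Element (k, l) of S'AS is the scalar product s(k) . A s(l).
    return [[sum(cols[k][i] * A[i][j] * cols[l][j]
                 for i in range(3) for j in range(3))
             for l in range(3)] for k in range(3)]


for theta in (0.0, 0.3, pi / 4, 2.0):
    print(theta, [[round(v, 10) for v in row] for row in reduced_matrix(theta)])
```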


z 
15. Classification and Properties of Real Quadratic Forms 
In this section we summarize several properties of real quadratic forms

(15.1)  $Q = a_{ij}x_ix_j$,  $(i, j = 1, 2, \ldots, n)$,

which are of considerable importance in applications.


We have shown that the real quadratic form $Q$ can be reduced by an
orthogonal transformation

(15.2)  $\xi_i = s_{ij}x_j$

to the canonical form

(15.3)  $Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2.$

The problem of reduction of (15.1) to the form 15.3 is equivalent to
the search of an orthogonal matrix $S = (s_{ij})$ satisfying the matric equation

(15.4)  $S^{-1}AS = \Lambda$,  (or $S'AS = \Lambda$),


where the elements along the diagonal in the $\Lambda$ matrix are the roots of
the determinantal equation

(15.5)  $|A - \lambda I| = 0,$

and $A$ is a real symmetric matrix.

Since the determinant of $S$ does not vanish, it is clear from (15.4) that
the rank of $A$ is equal to the rank of $\Lambda$. If the characteristic equation
15.5 has $n$ nonvanishing roots, then the number of terms actually appearing
in (15.3) is $n$. If, however, equation 15.5 has $r < n$ nonvanishing roots,
then the reduced form 15.3 will have the appearance

(15.6)  $Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_r\xi_r^2,$

and we shall say that the rank of (15.1) is $r$. The number of positive $\lambda$'s
appearing in (15.6) is called the index of $Q$. If we have a form (15.6)
with $p$ positive and $r - p$ negative $\lambda$'s, we can introduce a real
transformation $\xi_i' = \sqrt{\lambda_i}\,\xi_i$ for terms with positive $\lambda$'s and $\xi_i' = \sqrt{-\lambda_i}\,\xi_i$
for terms with negative $\lambda$'s so that it assumes the form

(15.7)  $Q = \xi_1'^2 + \xi_2'^2 + \cdots + \xi_p'^2 - \xi_{p+1}'^2 - \cdots - \xi_r'^2.$

Thus every real quadratic form $Q$ can be reduced by a real linear
transformation $\xi_i' = c_{ij}x_j$ to the canonical form 15.7. The matrix $(c_{ij})$, of
course, is not necessarily orthogonal.

The form 15.7 provides a means for the classification of quadratic 
forms. 

We consider the following cases. 


1. The index p in (15.7) is equal to n, so that equation 15.5 has n 
positive roots. In this case we say that the form 15.1 is positive definite. 

2. If the index p = 0, so that all roots of (15.5) are negative and the 
rank of Q is n, the form 15.1 is negative definite. 




3. If the index p is equal to the rank r and r < n, then the form is 
said to be positive. On the other hand, if the index is zero and the rank 
r <n, the form Q is negative. 

4. The forms whose canonical representation 15.3 contains both
positive and negative $\lambda$'s are called indefinite.
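The four cases can be collected into a short routine that classifies a form from its characteristic roots. The Python sketch below is a minimal illustration: the function name `classify`, the tolerance used to decide whether a root vanishes, and the label returned for a form of rank zero (a case the list above does not name) are all assumptions made for the example.

```python
def classify(lams, tol=1e-12):
    """Classify a real quadratic form from the roots of |A - lam I| = 0."""
    n = len(lams)
    positive = sum(1 for lam in lams if lam > tol)    # the index p
    negative = sum(1 for lam in lams if lam < -tol)
    rank = positive + negative                        # the rank r
    if positive and negative:
        return "indefinite"
    if rank == 0:
        return "zero form"          # edge case, label is an assumption
    if positive == n:
        return "positive definite"
    if negative == n:
        return "negative definite"
    return "positive" if positive else "negative"


print(classify([-2, -18, 9]))   # roots of the first example of Sec. 14
print(classify([2, 2, 4]))      # roots of the second example of Sec. 14
print(classify([1, 0, 3]))
print(classify([0, -5]))
```

The roots are assumed to have been computed beforehand, for instance by the methods of Sec. 13.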


We observe that positive and negative definite forms never vanish for
real nonzero values of the variables $x_i$. They vanish if, and only if, all
$x_i$'s vanish. In contradistinction, the positive and negative forms may
vanish for nonzero values of the arguments $x_i$. To see this, note that,
if $r < n$, then

$Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_r\xi_r^2.$

We can make (15.1) vanish by choosing the $x_i$ in (15.2) so that

$\xi_1 = \xi_2 = \cdots = \xi_r = 0.$

The nonvanishing values of $x_i$ will surely exist, since the system of $r$
homogeneous equations,

$s_{ij}x_j = 0$,  $(i = 1, \ldots, r)$,

in $n$ unknowns $x_j$, has nontrivial solutions whenever $r < n$.

It follows at once from (15.4), and from the fact that in a positive
definite form the $\lambda_i$'s in $\Lambda$ are all positive, that the determinant $|a_{ij}|$ of
the positive definite form is necessarily positive. The converse of this,
clearly, is not true. This can be readily seen by noting that $|A| = |\Lambda|$,
and the positive value of $|\Lambda|$ admits indefinite as well as definite forms.


16. Simultaneous Reduction of Two Quadratic Forms to a Sum of Squares 


We conclude our study of quadratic forms by investigating the possi- 
bility of simultaneous reduction of two real quadratic forms to the sum 
of squares by a single linear transformation. This problem arises, among 
other places, in a study of oscillations of mechanical systems about the 
state of equilibrium. 

Consider two real quadratic forms

(16.1)  $Q_1 = a_{ij}x_ix_j$  and  $Q_2 = b_{ij}x_ix_j,$

each of rank $n$, one of which, say $Q_1$, is positive definite. Let it be required
to find a linear transformation, not necessarily orthogonal, such that
both forms reduce to the sum of squares.

If $Q_1$ is positive definite and of rank $n$, then there exists a linear
transformation $x_i = c_{ij}\xi_j$, not necessarily orthogonal, under which $Q_1$ reduces
to the form

(16.2)  $Q_1 = \xi_1^2 + \xi_2^2 + \cdots + \xi_n^2.$


Under the same transformation $Q_2$ will assume the form

(16.3)  $Q_2 = b_{ij}'\xi_i\xi_j.$

Now, under a suitable orthogonal transformation $\xi_i = d_{ij}\eta_j$ on the
variables $\xi_i$, $Q_2$ can be reduced to the form

(16.4)  $Q_2 = \lambda_i\eta_i^2,$

and, since orthogonal transformations leave the scalar product $\xi_i\xi_i$
invariant, the form $Q_1$ will be unchanged, and we have

(16.5)  $Q_1 = \eta_i\eta_i.$

Now $Q_1$ and $Q_2$ are in the desired forms, and, since the product of
successive linear transformations from $x_i$ to $\eta_i$ is a linear transformation
$x_i = s_{ij}\eta_j$, it follows that the simultaneous reduction can be accomplished.

The numbers $\lambda_i$ in (16.4) are called the characteristic numbers of the
form $Q_2$ relative to $Q_1$. We proceed to derive the equation for the
characteristic numbers $\lambda_i$.

We recall that if the variables $x_i$ in a form $Q = a_{ij}x_ix_j$ with a matrix
$A$ are subjected to a linear transformation $x_i = s_{ij}\eta_j$ with the matrix $S$,
then the matrix of the resulting quadratic form is $S'AS$. The determinant
of this matrix has the value $|S|^2\,|A|$. Now let us construct the quadratic
form

(16.6)  $Q \equiv Q_2 - \lambda Q_1 = (b_{ij} - \lambda a_{ij})x_ix_j,$

where $\lambda$ is an arbitrary parameter. Under successive linear transformations
from the variables $x_i$ to $\eta_i$, $Q_2$ and $Q_1$ assume the forms (16.4) and (16.5),
and hence (16.6) reduces to

(16.7)
$$\begin{aligned} Q &= \lambda_1\eta_1^2 + \lambda_2\eta_2^2 + \cdots + \lambda_n\eta_n^2 - \lambda\eta_1^2 - \lambda\eta_2^2 - \cdots - \lambda\eta_n^2 \\ &= \sum_{i=1}^{n}(\lambda_i - \lambda)\eta_i^2. \end{aligned}$$

The determinant $\Delta$ of the form in (16.7) is

(16.8)  $\Delta = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda),$

whereas the determinant of $Q$ in (16.6) is

(16.9)  $D = |b_{ij} - \lambda a_{ij}|.$


It follows from remarks just made regarding the value of the determinant
in the transformed quadratic form that the determinants $D$ and $\Delta$ can
differ only by a constant multiple equal to the square of the determinant
$|S|$ of the transformation from the initial variables $x_i$ to the final variables
$\eta_i$. Since this determinant does not vanish, and since it contains no
parameter $\lambda$, the roots of polynomials 16.8 and 16.9 are identical. Taking
account of the structure of expression 16.8, we conclude that the
coefficients $\lambda_i$ in (16.4) are the roots of the determinantal equation

$$D = \begin{vmatrix} b_{11} - \lambda a_{11} & b_{12} - \lambda a_{12} & \cdots & b_{1n} - \lambda a_{1n} \\ b_{21} - \lambda a_{21} & b_{22} - \lambda a_{22} & \cdots & b_{2n} - \lambda a_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} - \lambda a_{n1} & b_{n2} - \lambda a_{n2} & \cdots & b_{nn} - \lambda a_{nn} \end{vmatrix} = 0.$$


In application of these results to the study of small vibrations of
mechanical systems about the point of equilibrium, the forms $Q_1$ and $Q_2$
are identified with the kinetic and potential energies of the system. The
final coordinates $\eta_i$ are termed normal coordinates, and the characteristic
numbers $\lambda_i$ are related to normal modes of vibration (see Sec. 89).
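A two-variable example makes the simultaneous reduction concrete. The Python sketch below finds the roots of $|b_{ij} - \lambda a_{ij}| = 0$ for an illustrative pair of forms, builds the columns of $S$ from the solutions of $(b_{ij} - \lambda a_{ij})x_j = 0$ scaled so that $Q_1 = 1$ on each of them, and checks that $S'AS$ is the unit matrix and $S'BS$ is diagonal with the characteristic numbers on the diagonal. The sample matrices and helper names are assumptions, and the shortcut used for each column presumes that the first row of $B - \lambda A$ is not identically zero. This direct route is not the two-stage construction of the text, but it leads to the same canonical forms.

```python
from math import sqrt, isclose

A = [[2.0, 1.0], [1.0, 2.0]]   # matrix of Q1, positive definite (illustrative)
B = [[1.0, 0.0], [0.0, 3.0]]   # matrix of Q2 (illustrative)

# |B - lam A| = 0 written out for two variables: qa*lam^2 - qb*lam + qc = 0.
qa = A[0][0] * A[1][1] - A[0][1] ** 2
qb = A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2 * A[0][1] * B[0][1]
qc = B[0][0] * B[1][1] - B[0][1] ** 2
disc = sqrt(qb * qb - 4 * qa * qc)
lams = [(qb - disc) / (2 * qa), (qb + disc) / (2 * qa)]


def bilinear(M, u, v):
    """Value of m_ij u_i v_j."""
    return sum(M[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


cols = []
for lam in lams:
    m11 = B[0][0] - lam * A[0][0]
    m12 = B[0][1] - lam * A[0][1]
    x = [m12, -m11]                 # annihilates the first row of B - lam A
    scale = sqrt(bilinear(A, x, x))  # make Q1 equal to 1 on this vector
    cols.append([c / scale for c in x])

# Matrices of Q1 and Q2 in the final variables (elements s(k)' M s(l)).
Q1 = [[bilinear(A, cols[k], cols[l]) for l in range(2)] for k in range(2)]
Q2 = [[bilinear(B, cols[k], cols[l]) for l in range(2)] for k in range(2)]
print(lams, Q1, Q2)
```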


17. Unitary Transformations and Hermitean Matrices 


In a variety of circumstances arising in applied mathematics it becomes
necessary to extend the concept of orthogonal transformations to vectors 
defined in a complex field. 

If we consider a nonsingular transformation

(17.1)  $\xi_i = a_{ij}x_j$,  $(i, j = 1, \ldots, n)$,

in which the coefficients $a_{ij}$ are complex numbers and the set of numbers
$(x_1, \ldots, x_n)$ represents the components of the vector $x$, the question
naturally arises about restrictions that must be imposed on the matrix
$(a_{ij})$ if the length $|x|$ of the vector is to be preserved. The imposition of
the condition of invariance of length, namely,

$\bar{\xi}_i\xi_i = \bar{x}_ix_i,$

leads at once to the conclusion that [cf. (11.5)]

(17.2)  $\bar{a}_{ij}a_{ik} = \delta_{jk},$

where bars, as usual, denote conjugate complex values. We deduce from
(17.2) that the absolute value of the square of the determinant $|a_{ij}|$ is 1.
Matrices $A = (a_{ij})$ whose elements satisfy the conditions 17.2 are called
unitary, and the corresponding transformations 17.1 are unitary
transformations. We can write (17.2) in the form

(17.3)  $\bar{A}'A = I,$

where $\bar{A}$ is the conjugate matrix formed by replacing every element $a_{ij}$
in $A$ by $\bar{a}_{ij}$. From (17.3) we conclude at once that $\bar{A}' = A^{-1}$.
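The defining condition is easy to test on a small complex matrix. The Python sketch below takes an illustrative $2 \times 2$ matrix (an assumption made for the example), checks that the product of its conjugate transpose with itself is the unit matrix, and confirms that the transformation preserves the length of a sample complex vector and that the determinant has unit modulus.

```python
from math import sqrt, isclose

r = 1 / sqrt(2)
A = [[r + 0j, 1j * r],
     [1j * r, r + 0j]]   # illustrative unitary matrix


def conj_transpose(M):
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def length_sq(v):
    """Squared length: sum of conj(v_i) * v_i."""
    return sum((c.conjugate() * c).real for c in v)


product = matmul(conj_transpose(A), A)      # should be the unit matrix

x = [1 + 2j, 3 - 1j]
xi = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(product, length_sq(x), length_sq(xi), abs(det) ** 2)
```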
A bilinear form

(17.4)  $H = a_{ij}\bar{x}_ix_j$,  $(i, j = 1, \ldots, n)$,

where $a_{ij} = \bar{a}_{ji}$, is called a Hermitean form, and the matrix $(a_{ij}) = A$,
corresponding to it, is a Hermitean matrix. It follows from the definition
of the Hermitean matrix that the elements along its diagonal are real and
that

$\bar{A}' = A.$

We observe that the Hermitean forms can assume, for arbitrary $x_i$,
only real values, since

$\bar{H} = \bar{a}_{ij}x_i\bar{x}_j = a_{ji}\bar{x}_jx_i = H.$

It is clear that the Hermitean forms are a generalization of real quadratic
forms.

One can raise the question of the possibility of reduction of the form
17.4 to the canonical form

(17.5)  $H = \lambda_1\bar{\xi}_1\xi_1 + \lambda_2\bar{\xi}_2\xi_2 + \cdots + \lambda_n\bar{\xi}_n\xi_n$

with the aid of the transformation

$x_i = u_{ij}\xi_j$,  or  $x = U\xi$,

where $U = (u_{ij})$ is a unitary matrix. A computation similar to that
carried out in Sec. 13 leads to the solution of the matric equation

(17.6)  $U^{-1}AU = \Lambda,$

where $\Lambda$ is a diagonal matrix. The procedure in this case is, in every
respect, similar to the one followed in the discussion of real symmetric
matrices. We multiply (17.6) by $U$ and obtain

(17.7)  $AU = U\Lambda,$

which represents a system of linear homogeneous equations for the
determination of vectors $\mathbf{u}^{(k)}$: $(u_{1k}, u_{2k}, \ldots, u_{nk})$ entering in the columns
of $U$. A necessary and sufficient condition that the system represented
by (17.7) have a nontrivial solution is that

(17.8)  $|A - \lambda I| = 0.$


The possibility of constructing a unitary matrix $U$ satisfying equation
17.6 hinges on the fact that here the roots of (17.8) are also real. The
fact that the characteristic roots $\lambda_i$ must necessarily be real follows from
the observation that $U^{-1}AU$ is a Hermitean matrix whenever $A$ is
Hermitean and $U$ is unitary.¹ Thus $\Lambda$ in (17.6) is Hermitean, and consequently
the elements along its diagonal are real.
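The reality of the characteristic roots and of the values of a Hermitean form can be observed on a $2 \times 2$ example. In the Python sketch below the sample matrix, the sample vector, and the function name `hermitean_form` are assumptions made for the illustration. The roots are computed with complex arithmetic, so nothing in the computation forces them to be real.

```python
import cmath

A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]   # a_ij equals the conjugate of a_ji (illustrative)

# Characteristic equation for two variables: lam^2 - (trace) lam + det = 0.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = cmath.sqrt(trace * trace - 4 * det)
roots = [(trace - disc) / 2, (trace + disc) / 2]


def hermitean_form(M, x):
    """Value of a_ij * conj(x_i) * x_j."""
    n = len(x)
    return sum(M[i][j] * x[i].conjugate() * x[j]
               for i in range(n) for j in range(n))


value = hermitean_form(A, [1 + 2j, -3 + 0.5j])
print(roots, value)
```

Both roots and the value of the form come out with vanishing imaginary parts, up to rounding.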


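Problems

1. Reduce the matrix

$A = (a_{ij}) =$ [entries not recoverable from the source]

to the diagonal form $\Lambda$ by the similitude transformation $C^{-1}AC$. Show that

[matrices not recoverable from the source]

Discuss the meaning of $A$ when it is viewed as an operator characterizing the
deformation of space.

2. Diagonalize the matrices:

$$\begin{bmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix}, \qquad \begin{bmatrix} -1 & 1 & 2 \\ 0 & -2 & 1 \\ 0 & 0 & -3 \end{bmatrix}.$$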

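¹ For $(U^{-1}AU)' = U'A'(U^{-1})'$ and $\overline{(U^{-1}AU)}' = \bar{U}'\bar{A}'(\bar{U}^{-1})'$. Since $A$ is Hermitean,
$\bar{A}' = A$, and since $U$ is unitary, $\bar{U}' = U^{-1}$ and $(\bar{U}^{-1})' = U$. Thus we have
$\overline{(U^{-1}AU)}' = U^{-1}AU$.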


2 


TENSOR THEORY 


18. Scope of Tensor Analysis. Invariance 


Tensor analysis is concerned with a study of abstract objects, called 
tensors, whose properties are independent of the reference frames used to 
describe the objects. A tensor is represented in a particular reference 
frame by a set of functions, termed its components, just as a vector is 
determined in a given reference frame by a set of components. Whether 
a given set of functions represents a tensor depends on the law of trans- 
formation of these functions from one coordinate system to another. 
The situation here is identical with that already encountered in Chapter 1. 
In a given reference frame a vector $\mathbf{A}$ is determined uniquely by a set of
components $A_i$. If a new coordinate system is introduced, the same vector
$\mathbf{A}$ is determined by a set of components $B_i$, and these new components are
related, in a definite way, to the old ones. It is the law of transformation
of components of a vector that is the essence of the vector idea, and the 
same is true of tensors. 

Since tensor analysis deals with entities and properties that are in- 
dependent of the choice of reference frames it forms an ideal tool for 
the study of natural laws. Indeed, whether a logical deduction based on 
a conglomerate of observational facts deserves the name of a natural law 
is often determined by the generality of such a deduction, and by its 
validity in a sufficiently wide class of reference systems. This is intimately 
bound up with the possibility of formulating the deduction in the form of 
a tensor equation because tensor equations are invariant with respect to a 
given category of coordinate transformations. The concept of invariance 
of mathematical objects, under coordinate transformations, permeates the 
structure of tensor analysis to such an extent that it is important to get at 
the outset a clear notion of the particular brand of invariance we have in 
mind. We shall suppose that a point is an invariant. In a given reference
frame a point $P$ is determined by a set of coordinates $x^i$. If the coordinate
system is changed, the point $P$ is described by a new set of coordinates $y^i$,
but the transformation of coordinates does nothing to the point itself.
Again, a pair of points $(P_1, P_2)$ determines a vector $\overrightarrow{P_1P_2}$. This vector, in
a particular reference frame, is uniquely determined by a set of components
$A_i$. A transformation of coordinates does nothing to the vector $\overrightarrow{P_1P_2}$, but
in the new reference frame $\overrightarrow{P_1P_2}$ is characterized by a different set of
components $B_i$. A set of points, such as those forming a curve or surface, is
also invariant. The curve may be described in a given coordinate system
by an equation which usually changes its form when the coordinates are
changed, but the curve itself remains unaltered. We shall say, in general,
that an object, whatever its nature, is an invariant, provided that it is not
altered by a transformation of coordinates.


19. Transformation of Coordinates 


In Chapter 1 we discussed at some length linear transformations of
coordinates. Here we will deal with real, single-valued, reversible functional
transformations of the form

(19.1)  $T$:  $y^i = y^i(x^1, x^2, \ldots, x^n)$,  $(i = 1, 2, \ldots, n)$,

where we use superscripts to identify the variables. A particular set of $n$
real numbers $(x_0^1, x_0^2, \ldots, x_0^n)$ can be thought to specify a point $P_0$ in the
$n$-dimensional metric manifold covered by a coordinate system $X$. The
set of equations 19.1 will be viewed as a transformation of coordinate
systems, so that the $n$-tuple of numbers $(y_0^1, y_0^2, \ldots, y_0^n)$ obtained by
substituting in (19.1) the coordinates $(x_0^1, x_0^2, \ldots, x_0^n)$ represents the
coordinates of $P_0$ in the $Y$-reference frame. Since the transformation $T$ in
(19.1) was assumed to be reversible and one-to-one, we can write

(19.2)  $T^{-1}$:  $x^i = x^i(y^1, y^2, \ldots, y^n)$,  $(i = 1, 2, \ldots, n)$,

where the functions¹ $x^i(y)$ are single-valued. To ensure the satisfaction of
restrictions we have just imposed on the transformation of coordinates, it
will suffice to suppose that the functions $y^i(x)$ in (19.1) are continuous
together with their first partial derivatives in some region $R$ of the
manifold $V_n$, and that the Jacobian determinant $J = \left|\dfrac{\partial y^i}{\partial x^j}\right|$ does not vanish at
any point of the region $R$. It would follow then² that not only a
single-valued inverse (19.2) exists, but the functions $x^i(y)$ in (19.2) are also of
class $C^1$ in some neighborhood of the point under consideration.


1 We will often use the notation $x^i(y)$ and $f(x)$ to mean $x^i(y^1, \dots, y^n)$ and
$f(x^1, x^2, \dots, x^n)$, respectively.

2 See, for example, I. S. Sokolnikoff, Advanced Calculus, pp. 433-438. We use the
symbol $C^n$ to denote the class of functions which are continuous together with their
first n partial derivatives.


52 TENSOR THEORY [CHAP. 2 


We observe that, if the functions $y^i(x)$ in (19.1) are of class $C^1$, then, by
Taylor's formula,

$$y^i = a^i + a^i_j x^j,$$

where $a^i_j$ is the value of $\partial y^i/\partial x^j$ evaluated at some point P' of the region R.
The point P' depends, of course, on the choice of values $(x^1, x^2, \dots, x^n)$.
Thus the transformation 19.1, with stated properties, is locally linear. The
nonvanishing of the Jacobian guarantees that this system of linear equations
has a unique solution. Throughout the rest of this book we shall
suppose that all encountered transformations of coordinates are of the
form 19.1, in which the functions $y^i(x)$ are at least of class $C^1$ in some
region R, and that $\left|\dfrac{\partial y^i}{\partial x^j}\right| \neq 0$ at any point of R. For brevity we shall refer

to a class of coordinate transformations with these properties as admissible 
transformations. 

As an example of an admissible transformation consider a system of
equations specifying the relation between the spherical polar coordinates
$x^i$ and the rectangular cartesian coordinates $y^i$,

$$T: \quad \begin{cases} y^1 = x^1 \sin x^2 \cos x^3, \\ y^2 = x^1 \sin x^2 \sin x^3, \\ y^3 = x^1 \cos x^2. \end{cases}$$

If we suppose that $x^1 > 0$, $0 < x^2 < \pi$, and $0 < x^3 < 2\pi$, then $J \neq 0$ and
the inverse transformation is given by

$$T^{-1}: \quad \begin{cases} x^1 = \sqrt{(y^1)^2 + (y^2)^2 + (y^3)^2}, \\[4pt] x^2 = \tan^{-1} \dfrac{\sqrt{(y^1)^2 + (y^2)^2}}{y^3}, \\[4pt] x^3 = \tan^{-1} \dfrac{y^2}{y^1}. \end{cases}$$
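The spherical polar example can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names and the sample point are assumptions of this illustration, not part of the text. The two-argument arctangent `atan2` is used in place of the bare inverse tangent so that the proper branch is selected automatically, and the closed form $J = (x^1)^2 \sin x^2$ quoted in the code is the standard value of the Jacobian of T.

```python
import math

def to_cartesian(x):
    """T: spherical polar (x1, x2, x3) -> rectangular cartesian (y1, y2, y3)."""
    r, theta, phi = x
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def to_spherical(y):
    """Inverse of T; atan2 returns 0 <= x2 <= pi, and the modulo maps x3 into [0, 2*pi)."""
    y1, y2, y3 = y
    r = math.sqrt(y1 * y1 + y2 * y2 + y3 * y3)
    theta = math.atan2(math.hypot(y1, y2), y3)
    phi = math.atan2(y2, y1) % (2.0 * math.pi)
    return (r, theta, phi)

def jacobian_det(x):
    """Closed form of J = |dy^i/dx^j| for T, namely (x1)^2 sin(x2)."""
    return x[0] ** 2 * math.sin(x[1])

point = (2.0, 0.7, 4.0)                    # sample point inside the admissible region
back = to_spherical(to_cartesian(point))   # round trip through T and its inverse
print(back, jacobian_det(point))
```

Within the stated ranges of $x^1$ and $x^2$ the determinant is strictly positive, which is why the round trip recovers the starting coordinates.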
Problem 


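Discuss the transformations in which the coordinates $y^i$ are rectangular
cartesian:

(a)
$$\begin{aligned} y^1 &= \tfrac{1}{\sqrt{6}}\, x^1 + \tfrac{2}{\sqrt{6}}\, x^2 + \tfrac{1}{\sqrt{6}}\, x^3, \\ y^2 &= \tfrac{1}{\sqrt{3}}\, x^1 - \tfrac{1}{\sqrt{3}}\, x^2 + \tfrac{1}{\sqrt{3}}\, x^3, \\ y^3 &= \tfrac{1}{\sqrt{2}}\, x^1 - \tfrac{1}{\sqrt{2}}\, x^3. \end{aligned}$$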
(b) $y^1 = x^1 \cos x^2, \qquad y^2 = x^1 \sin x^2.$




20. Properties of Admissible Transformations of Coordinates 


From a summary of certain important properties of admissible co- 
ordinate transformations contained in this section, we will see that it is 
quite immaterial what particular reference frame is selected to describe the 
invariant entities. It will be shown that all admissible transformations of 
coordinates form a group, and hence every coordinate system in the family 
can be obtained from the particular one by an admissible transformation. 
This fact is of great moment in the construction of a theory that lays claim 
to its independence of the accidental choice of reference systems. 

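THEOREM I. If a transformation of coordinates T possesses an inverse
$T^{-1}$ and if J and K are the Jacobians of T and $T^{-1}$, respectively, then JK = 1.

The proof is easy. We insert the values of $x^i$ from (19.2) in (19.1) and
obtain a set of identities in $y^i$,

$$y^i \equiv y^i[x^1(y^1, \dots, y^n), \dots, x^n(y^1, \dots, y^n)].$$

The differentiation with respect to $y^j$ yields

$$\delta^i_j = \frac{\partial y^i}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial y^j}, \qquad (i, j = 1, 2, \dots, n),$$

so that

$$|\delta^i_j| = \left|\frac{\partial y^i}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial y^j}\right| = \left|\frac{\partial y^i}{\partial x^\alpha}\right| \left|\frac{\partial x^\alpha}{\partial y^j}\right|.$$

Since $|\delta^i_j| = 1$, we see that $J \cdot K = 1$. Incidentally, it follows from this
result that $J \neq 0$ in R.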
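Theorem I is easy to test on a concrete transformation. The sketch below (Python, standard library only) takes the plane polar transformation $y^1 = x^1 \cos x^2$, $y^2 = x^1 \sin x^2$ as an assumed example, approximates both Jacobian matrices by central differences, and forms the product JK. The helper names and the step size are choices made for this illustration only.

```python
import math

def T(x):
    """T: plane polar (x1, x2) -> rectangular cartesian (y1, y2)."""
    return (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))

def T_inv(y):
    """Inverse transformation, valid for x1 > 0."""
    return (math.hypot(y[0], y[1]), math.atan2(y[1], y[0]))

def jacobian_det(func, p, h=1e-6):
    """Determinant of the 2x2 matrix [d func^i / d p^j], by central differences."""
    def column(j):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        fu, fd = func(up), func(dn)
        return [(fu[i] - fd[i]) / (2.0 * h) for i in range(2)]
    c0, c1 = column(0), column(1)
    return c0[0] * c1[1] - c1[0] * c0[1]

x = (1.5, 0.4)
J = jacobian_det(T, x)            # close to x1 = 1.5
K = jacobian_det(T_inv, T(x))     # close to 1 / 1.5
print(J, K, J * K)
```

Here J is approximately $x^1$ and K approximately $1/x^1$, so their product is 1 up to the error of the finite-difference approximation.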
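Consider now any two admissible transformations

$$T_1: \quad y^i = y^i(x^1, \dots, x^n),$$

and

$$T_2: \quad z^i = z^i(y^1, \dots, y^n), \qquad (i = 1, 2, \dots, n).$$

The transformation

$$T_3: \quad z^i = z^i[y^1(x^1, \dots, x^n), \dots, y^n(x^1, \dots, x^n)]$$

is called the product of $T_1$ and $T_2$, and we write $T_3 = T_2T_1$. If the Jacobian
of $T_3$ is denoted by $J_3$, it follows that

$$J_3 = \left|\frac{\partial z^i}{\partial x^j}\right| = \left|\frac{\partial z^i}{\partial y^\alpha}\frac{\partial y^\alpha}{\partial x^j}\right| = \left|\frac{\partial z^i}{\partial y^\alpha}\right| \left|\frac{\partial y^\alpha}{\partial x^j}\right| = J_2J_1,$$

where $J_1$ and $J_2$ are the Jacobians of $T_1$ and $T_2$, respectively.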




We can state this result as a 

THEOREM II. The Jacobian of the product transformation is equal to the 
product of the Jacobians of transformations entering in the product. 
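A numerical illustration of Theorem II, under assumptions made only for this sketch: $T_1$ is taken to be the plane polar transformation, $T_2$ an arbitrarily chosen linear map with Jacobian $-3$, and the Jacobians are approximated by central differences. None of these particular choices comes from the text.

```python
import math

def T1(x):
    """x (plane polar) -> y (rectangular cartesian)."""
    return (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))

def T2(y):
    """y -> z, a linear map chosen for illustration; its Jacobian is -3."""
    return (2.0 * y[0] + y[1], y[0] - y[1])

def T3(x):
    """Product transformation T3 = T2 T1 (T1 is applied first)."""
    return T2(T1(x))

def jacobian_det(func, p, h=1e-6):
    """Determinant of the 2x2 matrix [d func^i / d p^j], by central differences."""
    def column(j):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        fu, fd = func(up), func(dn)
        return [(fu[i] - fd[i]) / (2.0 * h) for i in range(2)]
    c0, c1 = column(0), column(1)
    return c0[0] * c1[1] - c1[0] * c0[1]

x = (1.5, 0.4)
J1 = jacobian_det(T1, x)        # evaluated at the point x
J2 = jacobian_det(T2, T1(x))    # evaluated at the image point y = T1(x)
J3 = jacobian_det(T3, x)
print(J1, J2, J3)
```

Note that $J_2$ must be evaluated at the image point $y = T_1(x)$, since the product of determinants in the theorem refers to the same point P in each frame.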

These theorems enable us to establish an important 

THEOREM III. The set of all admissible transformations of coordinates 
forms a group. 

The truth of the theorem becomes obvious if one notes that 


(a) The fundamental group property, namely, the product of two 
admissible transformations is a transformation belonging to the set of 
admissible transformations, is clearly satisfied. This property is known as 
the property of closure. 

(b) The product transformation possesses an inverse, since the transformations
appearing in the product have inverses.

(c) The identity transformation $(x^i = y^i)$ obviously exists.

(d) The associative law $T_3(T_2T_1) = (T_3T_2)T_1$ obviously holds.


These properties are precisely the ones entering in the definition of an 
abstract group. 

As noted in the beginning of this section, the fact that admissible 
transformations form a group justifies us in choosing as a point of depar- 
ture any convenient coordinate system, as long as it is one of those admitted 
in the set. 


21. Transformation by Invariance 


Let F(P) be a function of the point P in the n-dimensional manifold $V_n$.
We will suppose that F(P) is a continuous function in some region R of $V_n$,
and that $V_n$ is covered by some convenient coordinate system X. The
values of F(P) depend on the point P, but not on the coordinate system
used to represent P. We call F(P) a scalar point function or simply a scalar.
In the reference frame X, F(P) may assume the form $f(x^1, \dots, x^n)$, and, if
we introduce a new reference system Y by means of a transformation

$$T: \quad x^i = x^i(y^1, \dots, y^n), \tag{21.1}$$

the functional form of F(P) in the Y-frame is

$$f[x^1(y^1, \dots, y^n), \dots, x^n(y^1, \dots, y^n)] = g(y^1, \dots, y^n), \tag{21.2}$$

since the value of $f(x^1, \dots, x^n)$ at $P(x^1, \dots, x^n)$ is the same³ as that of
$g(y^1, \dots, y^n)$ at $P(y^1, \dots, y^n)$.


3 In a specific case, F(P) may represent the temperature of some region of space and
f(x) is the form which the temperature function assumes in the X-reference frame;
g(y) is the representation of F(P) in the Y-reference frame.




We can speak of f(x) as being the component of the scalar function
F(P) in the X-coordinate system, while g(y) is the component of the same
scalar function in the Y-coordinate system. Alternatively, we can regard
the scalar function F(P) as being defined by the totality of components f(x),
g(y), h(z), etc., each of which is related to one another by the substitution law
typified by formula 21.2. In other words, once the representation of the
scalar F(P) in one coordinate system is known, then the form of F(P) in
any other coordinate system Y is determined by formula 21.2. We call
this substitution transformation $G^0$: $f[x(y)] = g(y)$ the transformation by
invariance.
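The transformation by invariance is nothing more than substitution, as the following sketch suggests. It is written in Python with the standard library only; the particular scalar (a made-up "temperature" $f = (x^1)^2 + (x^2)^2 + x^1$ in a cartesian X-frame), the polar Y-frame, and all function names are assumptions chosen for the illustration.

```python
import math

def f(x):
    """Component of the scalar F(P) in the X-frame (rectangular cartesian)."""
    return x[0] ** 2 + x[1] ** 2 + x[0]

def x_of_y(y):
    """T: x^i = x^i(y), with y plane polar coordinates."""
    return (y[0] * math.cos(y[1]), y[0] * math.sin(y[1]))

def g(y):
    """Transformation by invariance G^0: g(y) = f[x(y)]."""
    return f(x_of_y(y))

def g_closed_form(y):
    """The same substitution carried out by hand: (y1)^2 + y1 cos(y2)."""
    return y[0] ** 2 + y[0] * math.cos(y[1])

y_P = (2.0, 1.1)        # coordinates of a point P in the Y-frame
x_P = x_of_y(y_P)       # coordinates of the same point in the X-frame
print(f(x_P), g(y_P), g_closed_form(y_P))
```

The functional forms f and g differ, but they return the same number when each is evaluated at the coordinates of the same point P in its own frame.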

We observe that, if we have three transformations $T_1$, $T_2$, and $T_3$, where

$$T_1: \quad x^i = x^i(y), \qquad T_2: \quad y^i = y^i(z),$$

with $T_3 = T_2T_1$, so that

$$T_3: \quad x^i = x^i[y(z)],$$

and a scalar F(P) whose component in the X-frame is f(x), we can compute
the transforms of f(x). Indeed, the component g(y) of F(P) in the Y-frame
is determined by the law

$$G_1^0: \quad g(y) = f[x(y)],$$

whereas the component h(z) of F(P) in the Z-frame is given by

$$G_2^0: \quad h(z) = g[y(z)].$$

On the other hand, using the product transformation $T_3 = T_2T_1$, we get

$$G_3^0: \quad h(z) = f\{x[y(z)]\},$$

from which it is clear that $G_3^0 = G_2^0G_1^0$.

We can represent these transformations of coordinates and the corresponding
transformation of components of F(P) diagrammatically as in
Fig. 7. Thus, as coordinates are subjected to a group T of admissible
transformations, the components of a scalar undergo a certain transformation
$G^0$. The relation between the successive transformations T and
$G^0$ is such that the product of two transformations $T_2T_1$ corresponds to the
product of two corresponding transformations $G_2^0G_1^0$. When such a relation
obtains between any two groups of transformations T and G, the groups
are said to be isomorphic. The isomorphism between the transformations
of coordinates and the transformations of functions induced by the
transformation of coordinates is an important characteristic of a class of
invariants called tensors.

[Fig. 7. Diagram: the coordinates x, y, z joined by the transformations $T_1$ and $T_2$, alongside the components f(x), g(y), h(z) joined by the induced transformations $G_1^0$ and $G_2^0$.]


22. Transformation by Covariance and Contravariance 


In the preceding section we discussed the transformation of components 
of a scalar F(P) when the coordinates of P undergo a transformation. In 
this section we discuss the law of transformation of entities determined by 
the sets of partial derivatives of a scalar. Sets of partial derivatives of the 


component $f(x^1, \dots, x^n)$ of a scalar F(P) are of interest in physics in
connection with the notion of a gradient of potential functions.

We consider a continuously differentiable function $f(x^1, \dots, x^n)$,
representing the scalar F(P), and a transformation of coordinates

$$T: \quad x^i = x^i(y^1, \dots, y^n). \tag{22.1}$$

If we form a set of n partial derivatives

$$\frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^2}, \dots, \frac{\partial f}{\partial x^n}, \quad \text{or} \quad \{f_{x^i}\}, \tag{22.2}$$

the question arises: What does the set $\{f_{x^i}\}$ become when the coordinates
$x^i$ are subjected to a transformation 22.1? This question is quite without
meaning unless one specifies precisely what is to be done with the set 22.2. 
These fractions do not automatically “become” anything until one states 
what law he is to use in calculating the “corresponding functions” in the 
Y-frame. In other words, it is necessary to agree on what the term 
“corresponding function” is to mean in a given situation. 

For example, we might calculate the corresponding functions by the
transformation of invariance $G^0$ of Sec. 21; that is, we can insert in each
function $f_{x^i}(x^1, \dots, x^n)$ the values of the x's from (22.1). This will yield
a set of n functions

$$f_{x^i}[x^1(y), \dots, x^n(y)], \qquad (i = 1, \dots, n). \tag{22.3}$$

On the other hand, if one has in mind the notion of a gradient of F(P), it
is necessary to say that the set of functions corresponding to (22.2) is not
(22.3), but the set of n partial derivatives,

$$\frac{\partial f}{\partial y^1}, \frac{\partial f}{\partial y^2}, \dots, \frac{\partial f}{\partial y^n}, \tag{22.4}$$

computed by the rule for differentiation of composite functions, namely,

$$G^1: \quad \frac{\partial f}{\partial y^i} = \frac{\partial f}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial y^i}, \qquad (i = 1, \dots, n). \tag{22.5}$$

If we have a function $f(x^1, \dots, x^n)$ and a transformation

$$x^i = x^i(z^1, \dots, z^n),$$

the set of functions corresponding to (22.2), determined by the law $G^1$
(equation 22.5), is

$$\frac{\partial f}{\partial z^i} = \frac{\partial f}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial z^i}.$$

We can think of the sets of functions $\{\partial f/\partial x^i\}$, $\{\partial f/\partial y^i\}$, $\{\partial f/\partial z^i\}$, etc., as
representing the same entity in different reference frames. At any particular
point $P_0(x_0^1, \dots, x_0^n)$ the set 22.2 determines n numbers, which
can be regarded as the components of the gradient vector, and the set 22.4
represents the same vector in the Y-coordinate system.

If we have a set of n functions $A_1(x), \dots, A_n(x)$ associated with the
X-coordinate system, and if we agree to calculate the corresponding
quantities $B_1(y), \dots, B_n(y)$ in the Y-system by means of the covariant
law $G^1$, namely,

$$B_i(y) = \frac{\partial x^\alpha}{\partial y^i} A_\alpha(x), \tag{22.6}$$

we say that the set $\{A_i(x)\}$ represents the components of a covariant vector
in the X-coordinate system. The set $\{B_i(y)\}$ represents the same covariant
vector in the Y-system, and the covariant vector itself is the totality of sets
of such quantities each related to one another by the covariant law $G^1$.
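The covariant law can be tried out on a gradient. In the sketch below (Python, standard library only) the X-frame is assumed cartesian, the Y-frame plane polar, and the scalar is the assumed example $f = (x^1)^2 + (x^2)^2 + x^1$, whose transform by invariance is $g = (y^1)^2 + y^1 \cos y^2$. The function names are hypothetical. The components $B_i$ obtained from the law 22.6 are compared with the derivatives of g computed directly.

```python
import math

def grad_f_x(x):
    """A_alpha = df/dx^alpha for f = (x1)^2 + (x2)^2 + x1 in the X-frame."""
    return (2.0 * x[0] + 1.0, 2.0 * x[1])

def x_of_y(y):
    """T: x^i = x^i(y), with y plane polar coordinates."""
    return (y[0] * math.cos(y[1]), y[0] * math.sin(y[1]))

def dx_dy(y):
    """Matrix m[alpha][i] = dx^alpha / dy^i for the polar substitution."""
    r, t = y
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def covariant_law(A, m):
    """B_i = (dx^alpha / dy^i) A_alpha, summed on alpha."""
    n = len(A)
    return [sum(m[a][i] * A[a] for a in range(n)) for i in range(n)]

def grad_g_y(y):
    """dg/dy^i computed directly from g(y) = (y1)^2 + y1 cos(y2)."""
    return [2.0 * y[0] + math.cos(y[1]), -y[0] * math.sin(y[1])]

y_P = (2.0, 1.1)
B = covariant_law(grad_f_x(x_of_y(y_P)), dx_dy(y_P))
print(B, grad_g_y(y_P))
```

Substituting x(y) into the components $A_\alpha$ alone (the law $G^0$) would give different numbers; it is the additional factor $\partial x^\alpha/\partial y^i$ that makes the result agree with the gradient of g.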

As an illustration of the law of transformation of vectors, which is
quite different from the law $G^1$, consider a set of n differentials

$$dx^1, dx^2, \dots, dx^n, \tag{22.7}$$

where the $x^i$'s are related to the variables $y^i$ by the formula 22.1. If we
have two points $P_1(x^1, \dots, x^n)$ and $P_2(x^1 + dx^1, \dots, x^n + dx^n)$, then the
set of n numbers 22.7 determines the displacement vector from $P_1$ to $P_2$.

The same displacement vector when referred to the Y-coordinate
system has for its components

$$dy^1, dy^2, \dots, dy^n, \tag{22.8}$$

where

$$G^2: \quad dy^i = \frac{\partial y^i}{\partial x^\alpha}\, dx^\alpha, \qquad (i = 1, \dots, n).$$


Note that the law $G^2$, for the determination of the quantities 22.8, is
different from $G^1$. If we have a set of quantities $A^1(x), A^2(x), \dots, A^n(x)$,
then the law $G^2$, determining the corresponding quantities $B^1(y), B^2(y), \dots,
B^n(y)$, is

$$B^i = \frac{\partial y^i}{\partial x^\alpha} A^\alpha. \tag{22.9}$$

The law $G^2$ is the contravariant law, and we call the sets of quantities
transforming in accordance with it the components of a contravariant vector.

The laws $G^0$, $G^1$, and $G^2$ play a fundamental role in the development of
tensor analysis.
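The contravariant law can be illustrated with a small displacement. In this sketch (Python, standard library only) the X-frame is assumed to be plane polar and the Y-frame cartesian; the names, the base point, and the size of the displacement are choices made for the example. The components given by the law $G^2$ are compared with the actual difference of the cartesian coordinates of the two neighboring points, with which they agree to first order in the displacement.

```python
import math

def y_of_x(x):
    """T: y^i = y^i(x), x plane polar, y rectangular cartesian."""
    return (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))

def dy_dx(x):
    """Matrix m[i][alpha] = dy^i / dx^alpha."""
    r, t = x
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def contravariant_law(A, m):
    """B^i = (dy^i / dx^alpha) A^alpha, summed on alpha."""
    n = len(A)
    return [sum(m[i][a] * A[a] for a in range(n)) for i in range(n)]

x_P = (1.5, 0.4)
dx = (1e-6, 2e-6)                                   # small displacement in the X-frame
dy_law = contravariant_law(dx, dy_dx(x_P))          # components from the law G^2
x_Q = (x_P[0] + dx[0], x_P[1] + dx[1])
dy_actual = [q - p for q, p in zip(y_of_x(x_Q), y_of_x(x_P))]
print(dy_law, dy_actual)
```

The matrix used here, $\partial y^i/\partial x^\alpha$, is the inverse of the matrix $\partial x^\alpha/\partial y^i$ that appears in the covariant law, which is the essential difference between $G^1$ and $G^2$.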


Problems 


1. Show that if the transformation T: $y^i = a^i_j x^j$ is orthogonal, then the
distinction between the covariant and contravariant laws disappears.

2. Prove the theorem: If $f(x^1, x^2, \dots, x^n)$ is a homogeneous function of
degree m, then $\dfrac{\partial f}{\partial x^i}\, x^i = mf$.

3. Given $f(x^1, x^2, \dots, x^n)$ and a set of equations of transformation $x^i =
x^i(y^1, y^2, \dots, y^n)$, where each $y^i = y^i(t)$. If the transform of f by invariance is
$g(y^1, y^2, \dots, y^n)$, show that df/dt = dg/dt. Hint: $(\partial f/\partial x^\alpha)(dx^\alpha/dt) = df/dt$ and
$dx^\alpha/dt = (\partial x^\alpha/\partial y^i)(dy^i/dt)$.

4. Write out the laws of transformation of components of covariant and 


contravariant vectors when T is the transformation from rectangular cartesian 
to spherical polar coordinates given in Sec. 19. 


23. The Tensor Concept. Contravariant and Covariant Tensors 


Consider an admissible transformation

$$T: \quad y^i = y^i(x^1, \dots, x^n), \qquad (i = 1, \dots, n),$$

and a set $\{f_i\}$ of m continuous functions

$$f_i(x^1, \dots, x^n), \qquad (i = 1, 2, \dots, m),$$

defined in some region R of the n-dimensional space referred to the
X-system of coordinates.

We associate with the given transformation T a transformation G which
transforms each $f_i(x^1, x^2, \dots, x^n)$ into a function

$$g_i(y^1, y^2, \dots, y^n).$$

Examples of the transformation G are the transformation of invariance
and the contravariant and covariant laws introduced in preceding sections.
But, whatever the nature of the transformation G, it will always depend
on T, and to emphasize this fact we shall say that G is a function of T.
We shall call G an induced transformation on the set of functions $f_i$.




Suppose further that G, regarded as a function of T, satisfies the 
following conditions: 


(a) When T is an identity transformation, then G is an identity transformation.
This means that, if $y^i = x^i$, then

$$g_i(y^1, \dots, y^n) = f_i(y^1, \dots, y^n).$$

(b) If $T_1$, $T_2$, $T_3$ are three transformations of the type T, and $G_1$, $G_2$, $G_3$
are the corresponding induced transformations G, and if $T_3 = T_2T_1$, then
$G_3 = G_2G_1$. In other words, the sets of transformations T and G are
isomorphic.

If the given set of functions $\{f_i\}$ satisfies conditions (a) and
(b), we shall say that the set $\{f_i\}$ represents the components $f_i$ of a tensor f
in the X-coordinate system, the tensor f itself being the totality of sets of
functions $\{f_i(x)\}$, $\{g_i(y)\}$, etc.


It should be remarked that the term tensor was used by A. Einstein⁴
only in connection with the sets of quantities transforming in accordance
with the contravariant and covariant laws. The formulation of contravariant
and covariant laws, as well as an outline of the essential features of
the algebra and calculus of contravariant and covariant tensors, is due to
G. Ricci.⁵ The much broader characterization of tensors by the isomorphism
of transformations of coordinates and induced transformations
is essentially due to H. Weyl and O. Veblen.⁶ Because of the usefulness
and commonness of covariant and contravariant laws of transformation
in applications of analysis to geometry and physics, the term tensor is
generally used in the sense contemplated by Einstein. This usage is
followed in the sequel. However, the isomorphism between the laws of 
transformation of coordinates and the induced transformations is so 
fundamental to the idea of a tensor and to the invariant nature of tensor 
calculus that it justifies the degree of emphasis placed on it in the fore- 
going. 

We now turn to a consideration of covariant, contravariant, and mixed 
tensors. It will be convenient to introduce (with Ricci) different notations 
for each type of such tensors so that they can be recognized at a glance. 
Let us consider first a set of n functions of the variables $(x^1, \dots, x^n)$,

$$\{A(i; x)\} \quad \text{or} \quad A(1; x), A(2; x), \dots, A(n; x).$$

Previously we wrote the identifying index i either as a subscript or
superscript, but now we agree to use superscripts to denote the set of functions


4 A. Einstein, Annalen der Physik, 49 (1916). 

5 G. Ricci, Atti della reale accademia nazionale dei Lincei, 5 (1889). 

6 H. Weyl, Mathematische Zeitschrift, 23, 24 (1925-1926). O. Veblen, Invariants of
Quadratic Differential Forms, Cambridge Tract No. 24 (1927), pp. 19-20.




that transform in accordance with the contravariant law and subscripts 
for sets that transform in the covariant manner.⁷ Whenever the law of
transformation is neither covariant nor contravariant, or when its nature
is in doubt, we write $\{A(i; x)\}$, $\{B(i; y)\}$, etc. We now lay down the
following definitions. 

DEFINITION 1. A covariant tensor of rank one is the entire class of
sets of quantities $\{A(i; x)\}$, $\{B(i; y)\}$, $\{C(i; z)\}$, ... related to one another by
the transformation of the form

$$B(i; y) = \frac{\partial x^\alpha}{\partial y^i} A(\alpha; x), \qquad (i = 1, 2, \dots, n),$$

where $\{A(i; x)\}$ is the representation of the tensor in the X-coordinate
system, and $\{B(i; y)\}$ is its representation in any coordinate system Y
related to the X-system by the transformation T.

Frequently, we speak loosely of the given set {A(i; x)} as being a tensor, 
but this usage should not conceal the fact that the tensor is the totality of
sets of quantities typified by $\{A(i; x)\}$. The last set refers to the
representation of the tensor in a particular reference frame and can be spoken
of as the component of the tensor in the X-coordinate system. However, 
we shall use the term component of a tensor to mean the individual elements 
A(i; x) in the set {A(i; x)}. 

We denote components of covariant tensors by subscripts and often
suppress the variables x and y entering as the arguments of A's and B's.
Thus

$$B_i = \frac{\partial x^\alpha}{\partial y^i} A_\alpha \qquad \text{(covariant law)}.$$

DEFINITION 2. A contravariant tensor of rank one is the entire class
of quantities such as $\{A(i; x)\}$, $\{B(i; y)\}$, ... related to one another by the
transformation of the form

$$B(i; y) = \frac{\partial y^i}{\partial x^\alpha} A(\alpha; x),$$

where $\{A(i; x)\}$ represents the tensor in the X-coordinate system and $\{B(i; y)\}$
in the Y-coordinate system.

We denote components of contravariant tensors by superscripts. Thus

$$B^i = \frac{\partial y^i}{\partial x^\alpha} A^\alpha \qquad \text{(contravariant law)}.$$


7 The only exception to this convention is in the use of superscripts to identify the 
variables $x^i$, $y^i$, etc. These quantities do not transform according to a covariant or
contravariant law unless the transformation T is affine. 




The definitions of contravariant and covariant tensors of rank one are
identical with those of contravariant and covariant vectors given in Sec. 22.

We speak of scalars, defined in Sec. 21, as tensors of rank zero. 

We can generalize the definitions of tensors of rank one to include 
tensors of any rank as follows. 

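DEFINITION 3. A set of $n^r$ quantities $A_{i_1 i_2 \dots i_r}(x)$, associated with the
X-coordinate system, represents the components of a covariant tensor of
rank r if the corresponding set of $n^r$ quantities $B_{i_1 i_2 \dots i_r}(y)$, associated
with the Y-coordinate system, is given by

$$B_{i_1 i_2 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \frac{\partial x^{\alpha_2}}{\partial y^{i_2}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}}\, A_{\alpha_1 \alpha_2 \dots \alpha_r}(x).$$

The tensor itself is the totality of sets of such quantities as $\{A_{i_1 i_2 \dots i_r}\}$.

DEFINITION 4. A set of $n^r$ quantities $A^{i_1 i_2 \dots i_r}(x)$ represents the components
of a contravariant tensor of rank r in the X-coordinate system whenever the
corresponding set $B^{i_1 i_2 \dots i_r}(y)$ of $n^r$ quantities in the Y-system is given by

$$B^{i_1 i_2 \dots i_r}(y) = \frac{\partial y^{i_1}}{\partial x^{\alpha_1}} \frac{\partial y^{i_2}}{\partial x^{\alpha_2}} \cdots \frac{\partial y^{i_r}}{\partial x^{\alpha_r}}\, A^{\alpha_1 \alpha_2 \dots \alpha_r}(x).$$

As an illustration we note that the components of the covariant tensor
of rank two transform according to the law

$$B_{ij}(y) = \frac{\partial x^\alpha}{\partial y^i} \frac{\partial x^\beta}{\partial y^j}\, A_{\alpha\beta}(x),$$

whereas the components of the contravariant tensor are given by

$$B^{ij}(y) = \frac{\partial y^i}{\partial x^\alpha} \frac{\partial y^j}{\partial x^\beta}\, A^{\alpha\beta}(x).$$

There are $n^2$ components in each set.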
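The two rank-two laws can be compared on a small example. In the following sketch (Python, standard library only) the X-frame is assumed cartesian and the Y-frame plane polar, and the array of Kronecker deltas is used merely as a convenient set of components $A_{\alpha\beta}$ and $A^{\alpha\beta}$ in the X-frame. That choice, like the function names, is an assumption of the illustration rather than anything prescribed by the text.

```python
import math

def dx_dy(y):
    """m[a][i] = dx^a / dy^i for x1 = y1 cos(y2), x2 = y1 sin(y2)."""
    r, t = y
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def dy_dx(y):
    """m[i][a] = dy^i / dx^a at the same point (the inverse matrix of dx_dy)."""
    r, t = y
    return [[math.cos(t),      math.sin(t)],
            [-math.sin(t) / r, math.cos(t) / r]]

def covariant_rank2(A, m):
    """B_ij = (dx^a/dy^i)(dx^b/dy^j) A_ab, summed on a and b."""
    n = len(A)
    return [[sum(m[a][i] * m[b][j] * A[a][b] for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

def contravariant_rank2(A, m):
    """B^ij = (dy^i/dx^a)(dy^j/dx^b) A^ab, summed on a and b."""
    n = len(A)
    return [[sum(m[i][a] * m[j][b] * A[a][b] for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

identity = [[1.0, 0.0], [0.0, 1.0]]      # components in the X-frame
y_P = (2.0, 1.1)
B_lower = covariant_rank2(identity, dx_dy(y_P))       # close to [[1, 0], [0, (y1)^2]]
B_upper = contravariant_rank2(identity, dy_dx(y_P))   # close to [[1, 0], [0, 1/(y1)^2]]
print(B_lower, B_upper)
```

The same numerical array in the X-frame thus leads to different components in the Y-frame according to whether it is declared covariant or contravariant.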
We define next the mixed tensor.

DEFINITION 5. The totality of sets of $n^{r+s}$ quantities, typified in the
X-coordinate system by the expressions $A^{j_1 \dots j_s}_{i_1 \dots i_r}(x)$, is a mixed tensor,
covariant of rank r and contravariant of rank s, provided that the corresponding
quantities $B^{j_1 \dots j_s}_{i_1 \dots i_r}(y)$ in the Y-coordinate system are given by the law

$$B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\, A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x).$$

We note that this law for the transformation of components $A^j_i$ of the
mixed tensor gives

$$B^j_i(y) = \frac{\partial x^\alpha}{\partial y^i} \frac{\partial y^j}{\partial x^\beta}\, A^\beta_\alpha(x).$$

As a simple example of a mixed
tensor that already has occurred in our discussion, we cite the Kronecker
delta $\delta^i_j$. Thus

$$\frac{\partial x^\alpha}{\partial y^i} \frac{\partial y^j}{\partial x^\beta}\, \delta^\beta_\alpha = \frac{\partial x^\alpha}{\partial y^i} \frac{\partial y^j}{\partial x^\alpha} = \delta^j_i.$$

The verification of the fact that
the definition of covariant, contravariant, and mixed tensors satisfies
properties (a) and (b), stated in the beginning of this section, is given in
Sec. 24.

To distinguish tensors defined over a region of space from tensors whose
domain of definition is a single point, one occasionally speaks of the
former as constituting a tensor field.
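That the Kronecker delta keeps the same components under the mixed law can be confirmed numerically. The sketch below (Python, standard library only) again assumes a cartesian X-frame and a plane polar Y-frame; the helper names and the storage convention (arrays indexed as [upper][lower]) are choices made for the example.

```python
import math

def dx_dy(y):
    """m[a][i] = dx^a / dy^i for x1 = y1 cos(y2), x2 = y1 sin(y2)."""
    r, t = y
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def dy_dx(y):
    """m[j][b] = dy^j / dx^b at the same point (the inverse matrix of dx_dy)."""
    r, t = y
    return [[math.cos(t),      math.sin(t)],
            [-math.sin(t) / r, math.cos(t) / r]]

def mixed_law(A, y):
    """B^j_i = (dx^a/dy^i)(dy^j/dx^b) A^b_a, with arrays stored as [upper][lower]."""
    p, q = dx_dy(y), dy_dx(y)
    n = len(A)
    return [[sum(p[a][i] * q[j][b] * A[b][a] for a in range(n) for b in range(n))
             for i in range(n)] for j in range(n)]

delta = [[1.0, 0.0], [0.0, 1.0]]
delta_in_Y = mixed_law(delta, (2.0, 1.1))
print(delta_in_Y)
```

The result is the unit array again, in contrast with the purely covariant or purely contravariant laws, under which the same array of ones and zeros would change.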


24. Tensor Character of Covariant and Contravariant Laws 


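We will verify that the induced transformations defined in the preceding
section satisfy the isomorphism conditions stated in Sec. 21. The fact that
the transformation of invariance (leading to tensors of rank zero) fulfills
these conditions was noted in Sec. 21. The proofs for contravariant and
covariant tensors are special cases of the proof for a mixed tensor. Accordingly
we consider a mixed tensor typified by the set of functions $A^{j_1 \dots j_s}_{i_1 \dots i_r}(x)$.

The law G for the transformation of mixed tensors is

$$B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\, A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x), \tag{24.1}$$

and we must show that

(a) if $T = I$, then $G = I$;

(b) if $T_3 = T_2T_1$, then $G_3 = G_2G_1$.

Now, if $T = I$, then

$$y^i = x^i,$$

and hence

$$\frac{\partial y^i}{\partial x^j} = \delta^i_j.$$

Moreover, $T^{-1} = I$, so that

$$x^i = y^i,$$

so that

$$\frac{\partial x^i}{\partial y^j} = \delta^i_j.$$

Inserting these values of partial derivatives in (24.1) gives

$$B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \delta^{\alpha_1}_{i_1} \cdots \delta^{\alpha_r}_{i_r}\, \delta^{j_1}_{\beta_1} \cdots \delta^{j_s}_{\beta_s}\, A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x) = A^{j_1 \dots j_s}_{i_1 \dots i_r}(x).$$

Hence $G = I$ if $T = I$.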

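Suppose now that, under a transformation $T_1$, the variables $x^i$ transform
into $y^i$, and the variables $y^i$ transform into $z^i$ by the transformation $T_2$.
The corresponding induced transformations $G_1$ and $G_2$ yield

$$G_1: \quad B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\gamma_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\gamma_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\delta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\delta_s}}\, A^{\delta_1 \dots \delta_s}_{\gamma_1 \dots \gamma_r}(x), \tag{24.2}$$

and

$$G_2: \quad C^{j_1 \dots j_s}_{i_1 \dots i_r}(z) = \frac{\partial y^{\alpha_1}}{\partial z^{i_1}} \cdots \frac{\partial y^{\alpha_r}}{\partial z^{i_r}} \frac{\partial z^{j_1}}{\partial y^{\beta_1}} \cdots \frac{\partial z^{j_s}}{\partial y^{\beta_s}}\, B^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(y). \tag{24.3}$$

Now, under the product transformation $T_3 = T_2T_1$, the variables $x^i$
go into $y^i$ and the $y^i$ into $z^i$, so that $T_3$ carries the $x^i$ into the $z^i$. Inserting
the values of the B's from (24.2) into (24.3) gives

$$G_2G_1: \quad C^{j_1 \dots j_s}_{i_1 \dots i_r}(z) = \left(\frac{\partial y^{\alpha_1}}{\partial z^{i_1}} \cdots \frac{\partial y^{\alpha_r}}{\partial z^{i_r}}\right) \left(\frac{\partial z^{j_1}}{\partial y^{\beta_1}} \cdots \frac{\partial z^{j_s}}{\partial y^{\beta_s}}\right) \left(\frac{\partial x^{\gamma_1}}{\partial y^{\alpha_1}} \cdots \frac{\partial x^{\gamma_r}}{\partial y^{\alpha_r}}\right) \left(\frac{\partial y^{\beta_1}}{\partial x^{\delta_1}} \cdots \frac{\partial y^{\beta_s}}{\partial x^{\delta_s}}\right) A^{\delta_1 \dots \delta_s}_{\gamma_1 \dots \gamma_r}(x).$$

Performing the summation on $\alpha$'s and $\beta$'s yields

$$C^{j_1 \dots j_s}_{i_1 \dots i_r}(z) = \frac{\partial x^{\gamma_1}}{\partial z^{i_1}} \cdots \frac{\partial x^{\gamma_r}}{\partial z^{i_r}} \frac{\partial z^{j_1}}{\partial x^{\delta_1}} \cdots \frac{\partial z^{j_s}}{\partial x^{\delta_s}}\, A^{\delta_1 \dots \delta_s}_{\gamma_1 \dots \gamma_r}(x).$$

The resulting law G is precisely the law of transformation of the components
of a mixed tensor when the variables $x^i$ are transformed into the
$z^i$ by the transformation $T_3$. Thus the law of transformation G is transitive,
and this completes the proof.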
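The transitivity just proved can be observed numerically in the simplest setting, that of linear transformations, for which the partial derivatives are constant matrices. The sketch below (Python, standard library only) treats a mixed tensor $A^j_i$ in two dimensions; the matrices P and Q, the component array, and the function names are arbitrary choices made for the illustration.

```python
def matmul(P, Q):
    """Ordinary matrix product of two square arrays."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inv2(M):
    """Inverse of a 2x2 matrix with nonvanishing determinant."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def mixed_law(A, new_by_old):
    """B^j_i = (d old^a / d new^i)(d new^j / d old^b) A^b_a, arrays stored as [upper][lower]."""
    old_by_new = inv2(new_by_old)
    n = len(A)
    return [[sum(old_by_new[a][i] * new_by_old[j][b] * A[b][a]
                 for a in range(n) for b in range(n))
             for i in range(n)] for j in range(n)]

P = [[2.0, 1.0], [1.0, 1.0]]     # T1: y^i = P[i][j] x^j
Q = [[1.0, -1.0], [0.0, 3.0]]    # T2: z^i = Q[i][j] y^j
A = [[1.0, 2.0], [3.0, 4.0]]     # components A^b_a in the X-frame

C_stepwise = mixed_law(mixed_law(A, P), Q)   # G2 applied after G1
C_direct = mixed_law(A, matmul(Q, P))        # G3, induced by T3 = T2 T1
print(C_stepwise, C_direct)
```

In matrix language the mixed law reads $B = PAP^{-1}$, so the agreement of the two results is the familiar identity $Q(PAP^{-1})Q^{-1} = (QP)A(QP)^{-1}$.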

The results for covariant and contravariant tensors appear as special 
cases obtained by suppressing the superscripts or subscripts. 

The only types of tensors with which we will deal in this book are scalars, 
covariant, contravariant, mixed, and relative tensors. The last are defined 
in Sec. 28. 

We establish next a useful property of the law of the transformation of 
tensors, which is frequently used in the sequel. 

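Let the components of a mixed tensor in the X-coordinate system be
denoted by $A^{j_1 \dots j_s}_{i_1 \dots i_r}(x)$ and its components in the Y-system by $B^{j_1 \dots j_s}_{i_1 \dots i_r}(y)$.
Then, from the law of transformation of mixed tensors we can write

$$B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\, A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x). \tag{24.4}$$

On the other hand, if we are given the components $B^{j_1 \dots j_s}_{i_1 \dots i_r}(y)$, the
components $A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x)$ of the same tensor in the X-reference frame are
determined by the formula

$$A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x) = \frac{\partial y^{i_1}}{\partial x^{\alpha_1}} \cdots \frac{\partial y^{i_r}}{\partial x^{\alpha_r}} \frac{\partial x^{\beta_1}}{\partial y^{j_1}} \cdots \frac{\partial x^{\beta_s}}{\partial y^{j_s}}\, B^{j_1 \dots j_s}_{i_1 \dots i_r}(y). \tag{24.5}$$

We note that we can obtain (24.5) from (24.4) formally by treating the
partial derivatives and sums in (24.4) as though they were fractions and
products appearing in simple algebraic expressions.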

From the structure of formulas (24.4) and (24.5) we deduce an important 

THEOREM. If all components of a tensor vanish in one coordinate system,
then they necessarily vanish in all other admissible coordinate systems. 

This particular theorem is of profound significance in the formulation 
of physical laws. It states, in effect, that, if a certain law is implied by the 
vanishing of components of a tensor in one particular coordinate system, 
then the rules for transformation of the tensor components guarantee that 
they will vanish in all admissible coordinate systems. A physicist has 
little interest in the formulation of a law that might be valid only in some 
special reference frame. Indeed the notion of invariance and the universality 
of physical laws is the cornerstone about which mathematical physics is 
built. 


25. Algebra of Tensors 


In this section we establish several rules of operation with tensors, 
which are algebraic in character. 

THEOREM I. The sum (or difference) of two tensors which have the same 
number of covariant and the same number of contravariant indices is again 
a tensor of the same type and rank as the given tensors. 

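Proof. Consider two tensors $A(x)$ and $\bar{A}(x)$ of the same type and rank
defined at the same point P, and the corresponding laws of transformation:

$$B^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\, A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x),$$

$$\bar{B}^{j_1 \dots j_s}_{i_1 \dots i_r}(y) = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\, \bar{A}^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}(x).$$

Then

$$B^{j_1 \dots j_s}_{i_1 \dots i_r} + \bar{B}^{j_1 \dots j_s}_{i_1 \dots i_r} = \frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}} \left(A^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r} + \bar{A}^{\beta_1 \dots \beta_s}_{\alpha_1 \dots \alpha_r}\right).$$

It follows from this that $A + \bar{A}$ is a tensor, and we write its components as

$$A^{j_1 \dots j_s}_{i_1 \dots i_r}(x) + \bar{A}^{j_1 \dots j_s}_{i_1 \dots i_r}(x),$$

which is a tensor of the same type and rank as the given tensors.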

It is clear from the laws of transformation of tensors that, if each 
component of a tensor is multiplied by a constant, the resulting set of 
functions is a tensor. This fact, in conjunction with Theorem I, permits 
us to state a 

COROLLARY. Any linear combination of tensors of the same type and 
rank is again a tensor of the same type and rank. 

THEOREM II. The equation $A^{j_1 \dots j_s}_{i_1 \dots i_r}(x) = \bar{A}^{j_1 \dots j_s}_{i_1 \dots i_r}(x)$ is a tensor equation;
that is, if this equation is true in some coordinate system, then it is true
in all admissible systems.

Proof. It follows from Theorem I that the difference of two tensors is a
tensor. Hence the left-hand member of

$$A^{j_1 \dots j_s}_{i_1 \dots i_r} - \bar{A}^{j_1 \dots j_s}_{i_1 \dots i_r} = 0$$

is a tensor. However, we proved in Sec. 24 that, if all components of a tensor vanish
in one coordinate system, they vanish in all admissible coordinate systems.
We shall call the tensor all whose components vanish the zero tensor.

THEOREM III. The set of quantities consisting of the product of each
element of the set $A^{j_1 \dots j_q}_{i_1 \dots i_p}(x)$, representing a tensor $A$, by each element of
the set $\bar{A}^{l_1 \dots l_s}_{k_1 \dots k_r}(x)$, representing a tensor $\bar{A}$, defines the tensor $\mathscr{A}$, called the
outer product. This tensor is contravariant of rank q + s and covariant of
rank p + r.

From the definition of outer product, the components of $\mathscr{A}$ in the
X-reference frame are given by the formula

$$\mathscr{A}^{j_1 \dots j_q\, l_1 \dots l_s}_{i_1 \dots i_p\, k_1 \dots k_r} = A^{j_1 \dots j_q}_{i_1 \dots i_p}\, \bar{A}^{l_1 \dots l_s}_{k_1 \dots k_r}.$$

The fact that the set of functions $\mathscr{A}^{j_1 \dots j_q\, l_1 \dots l_s}_{i_1 \dots i_p\, k_1 \dots k_r}$ defines a tensor follows directly
from the law of transformation of components $A^{j_1 \dots j_q}_{i_1 \dots i_p}$ and $\bar{A}^{l_1 \dots l_s}_{k_1 \dots k_r}$.

We will denote the outer product $\mathscr{A}$ of $A$ and $\bar{A}$ by writing the symbols
in juxtaposition. Thus $\mathscr{A} = A\bar{A}$. It is obvious that the outer product is
distributive with respect to addition, so that

$$(A + B)C = AC + BC.$$

We introduce next the operation of contraction which yields tensors.




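THEOREM IV. If, in a mixed tensor, contravariant of rank s and covariant
of rank r, we equate a covariant and a contravariant index and sum with
respect to that index, then the resulting set of $n^{r+s-2}$ sums is a mixed
tensor, covariant of rank r − 1 and contravariant of rank s − 1.

To avoid complications in writing we illustrate the procedure used in
the proof by considering a mixed tensor $A^i_{jkl}$. We have

$$B^i_{jkl} = \frac{\partial y^i}{\partial x^\alpha} \frac{\partial x^\beta}{\partial y^j} \frac{\partial x^\gamma}{\partial y^k} \frac{\partial x^\delta}{\partial y^l}\, A^\alpha_{\beta\gamma\delta}.$$

If we equate the indices i and k and sum, we obtain the set of $n^2$ quantities

$$B^i_{jil} = \frac{\partial y^i}{\partial x^\alpha} \frac{\partial x^\beta}{\partial y^j} \frac{\partial x^\gamma}{\partial y^i} \frac{\partial x^\delta}{\partial y^l}\, A^\alpha_{\beta\gamma\delta} = \delta^\gamma_\alpha\, \frac{\partial x^\beta}{\partial y^j} \frac{\partial x^\delta}{\partial y^l}\, A^\alpha_{\beta\gamma\delta} = \frac{\partial x^\beta}{\partial y^j} \frac{\partial x^\delta}{\partial y^l}\, A^\alpha_{\beta\alpha\delta}.$$

Thus $B^i_{jil} \equiv B_{jl}$ is a covariant tensor of rank two.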

In this case we can obtain three different covariant tensors of rank two 
by performing the operation of contraction on other covariant indices. 
We observe that, when as a result of contraction of one or more pairs of 
indices there remain no free indices, the resulting quantity is a scalar. 

If it is possible to apply the operation of contraction to the outer product
of two tensors $A$ and $\bar{A}$, the result is a tensor called the inner product of
$A$ and $\bar{A}$. We denote the inner product by the symbol $A \cdot \bar{A}$. The proof
that $A \cdot \bar{A}$ is a tensor is immediate, for the outer product of two tensors is a
tensor, and the operation of contraction yields a tensor.

Example. Consider the tensors $A_{ij}(x)$, $A_k(x)$, and $A^k(x)$. If we form
the outer product $A_{ij}A_k = A_{ijk}$, we obtain a covariant tensor of rank
three, and hence no contraction is possible here. On the other hand, the
outer product of $A_{ij}$ and $A^k$ gives a mixed tensor $A_{ij}A^k = A^k_{ij}$, and in this
case we can contract to get a covariant tensor $A^\alpha_{\alpha j}$ or $A^\alpha_{i\alpha}$. As already
remarked, the tensor $A^i_{jkl}$ may be contracted in three different ways to
yield $A^\alpha_{\alpha kl}$, $A^\alpha_{j\alpha l}$, and $A^\alpha_{jk\alpha}$. The tensor $A^{ij}_{kl}$ can be contracted twice in
several ways. The contraction of $A^i_j$ yields a scalar.
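The statement that a complete contraction yields a scalar can be tested on the simplest inner product, that of a covariant vector $A_i$ with a contravariant vector $B^j$. The sketch below (Python, standard library only) assumes a cartesian X-frame and a plane polar Y-frame; the component values and the function names are invented for the illustration.

```python
import math

def dx_dy(y):
    """m[a][i] = dx^a / dy^i for x1 = y1 cos(y2), x2 = y1 sin(y2)."""
    r, t = y
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def dy_dx(y):
    """m[j][b] = dy^j / dx^b at the same point (the inverse matrix of dx_dy)."""
    r, t = y
    return [[math.cos(t),      math.sin(t)],
            [-math.sin(t) / r, math.cos(t) / r]]

def covariant_law(A, y):
    """A'_i = (dx^a/dy^i) A_a."""
    p = dx_dy(y)
    return [sum(p[a][i] * A[a] for a in range(2)) for i in range(2)]

def contravariant_law(B, y):
    """B'^j = (dy^j/dx^b) B^b."""
    q = dy_dx(y)
    return [sum(q[j][b] * B[b] for b in range(2)) for j in range(2)]

def outer(A, B):
    """Outer product C^j_i = A_i B^j, stored as C[j][i]."""
    return [[A[i] * B[j] for i in range(len(A))] for j in range(len(B))]

def contract(C):
    """Contraction C^i_i: equate the upper and lower index and sum."""
    return sum(C[i][i] for i in range(len(C)))

y_P = (2.0, 1.1)
A_x, B_x = [3.0, -1.0], [0.5, 2.0]      # components in the X-frame
C_x = outer(A_x, B_x)
C_y = outer(covariant_law(A_x, y_P), contravariant_law(B_x, y_P))
print(contract(C_x), contract(C_y))
```

The individual components of the outer product differ in the two frames, but the contracted sum $A_iB^i$ has the same value in both, as a tensor of rank zero must.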


26. Quotient Laws 


In this section we give two useful theorems which will enable us to 
establish the tensor character of sets of functions without going to the 
trouble of determining the law of transformation directly. 




We use the term inner product for sums of the type $A(\alpha, i_2, \dots, i_r)A_\alpha$
(or $A(\alpha, i_2, \dots, i_r)A^\alpha$) whether the set of functions $A(i_1, \dots, i_r)$ represents
a tensor or not. We also speak of tensors of rank one as vectors.

THEOREM I. Let $\{A(i_1, i_2, \dots, i_r)\}$ be a set of functions of the variables
$x^i$, and let the inner product $A(\alpha, i_2, \dots, i_r)\xi^\alpha$, with an arbitrary vector $\xi^\alpha$,
be a tensor of the type $A^{j_1 \dots j_p}_{k_1 \dots k_q}(x)$; then the set $A(i_1, \dots, i_r)$ represents a
tensor of the type $A^{j_1 \dots j_p}_{i_1 k_1 \dots k_q}(x)$.

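In order to avoid writing out long formulas for the transformation of
tensors with many covariant and contravariant indices, we will establish
this theorem for the set of $n^3$ functions $A(i, j, k)$, which has all features of
the more involved cases. Let us suppose that the inner product $A(\alpha, j, k)\xi^\alpha$
for an arbitrary vector $\xi^\alpha(x)$ yields a tensor of the type $A^k_j(x)$. We will
prove that the set $A(i, j, k)$ is a tensor of the type $A^k_{ij}$. By hypothesis
$A(\alpha, j, k)\xi^\alpha$ is a tensor of the type $A^k_j$; hence its transform $B(\alpha, j, k)\eta^\alpha$ is
given by the rule

$$B(\alpha, j, k)\,\eta^\alpha = \frac{\partial x^\beta}{\partial y^j} \frac{\partial y^k}{\partial x^\gamma}\, [A(\lambda, \beta, \gamma)\,\xi^\lambda],$$

where

$$\xi^\lambda(x) = \frac{\partial x^\lambda}{\partial y^\alpha}\, \eta^\alpha(y).$$

Inserting this expression for $\xi^\lambda$ in the right-hand member of the above
formula and transposing all terms on one side of the equation yields

$$\left[ B(\alpha, j, k) - \frac{\partial x^\lambda}{\partial y^\alpha} \frac{\partial x^\beta}{\partial y^j} \frac{\partial y^k}{\partial x^\gamma}\, A(\lambda, \beta, \gamma) \right] \eta^\alpha = 0.$$

However, $\eta^\alpha(y)$ is an arbitrary vector; hence the bracket must vanish, and
we obtain

$$B(\alpha, j, k) = \frac{\partial x^\lambda}{\partial y^\alpha} \frac{\partial x^\beta}{\partial y^j} \frac{\partial y^k}{\partial x^\gamma}\, A(\lambda, \beta, \gamma).$$

This is precisely the law of transformation of the tensor of the type $A^k_{ij}$.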
Clearly, we can state an analogous theorem in which the vector $\xi$ is a
covariant vector. For example, if $A(i, j, k, \alpha)\xi_\alpha$ is known to be a tensor of
the type $A^i_{jk}$ for an arbitrary vector $\xi_\alpha$, then $A(i, j, k, \alpha) = A^{i\alpha}_{jk}$. On the
other hand, if $A(i, j, k, \alpha)\xi_\alpha = A^{ijk}$, then $A(i, j, k, \alpha) = A^{ijk\alpha}$. These
expressions suggest that an algorithm of division can be employed to
determine the tensor character. Thus let $A(i, j, k, \alpha)\xi_\alpha = A^i_{jk}$ and write
symbolically

$$ A(i, j, k, \alpha) = \frac{A^i_{jk}}{\xi_\alpha}. $$


68 TENSOR THEORY - [CHaP. 2 


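Now, if we should regard the covariant quantities appearing below the
division line as contravariant when written above the line, we have

$$ A(i, j, k, \alpha) = A^i_{jk}\,\xi^\alpha, $$

where $\xi^\alpha$ is the symbolic reciprocal of $\xi_\alpha$. From the product $A^i_{jk}\xi^\alpha$ we see
that $A(i, j, k, \alpha) = A^{i\alpha}_{jk}$. Similarly, if $A(i, j, k, \alpha)\xi_\alpha = A^{ijk}$, then

$$ A(i, j, k, \alpha) = \frac{A^{ijk}}{\xi_\alpha} = A^{ijk}\,\xi^\alpha = A^{ijk\alpha}. $$

On the other hand, if $A(\alpha, j, k)\xi^\alpha = A^j_k$,

$$ A(\alpha, j, k) = \frac{A^j_k}{\xi^\alpha} = A^j_k\,\xi_\alpha = A^j_{\alpha k}. $$

In the division algorithm the contravariant quantities appearing below the
division line are to be regarded as covariant when written above the line.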
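
As a small illustration of Theorem I (a sketch chosen here for concreteness, not an example taken from the text), consider the set of $n^2$ functions $A(i, j)$ that equal 1 when $i = j$ and 0 otherwise in every coordinate system. Its inner product with an arbitrary vector $\xi^\alpha$ reproduces the vector, so the division algorithm gives:

```latex
A(\alpha, j)\,\xi^{\alpha} = \xi^{j}
\quad\Longrightarrow\quad
A(\alpha, j) = \frac{\xi^{j}}{\xi^{\alpha}} = \xi^{j}\,\xi_{\alpha} = A^{j}_{\alpha}.
```

Since $\xi^j$ is a tensor of the type $A^j$, Theorem I shows that this set is a mixed tensor of the type $A^j_i$; it is the Kronecker delta $\delta^j_i$.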

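THEOREM II. Let $\{A(i_1, \ldots, i_r)\}$ be a set of $n^r$ functions defined in the
X-coordinate system, and let $\{B(i_1, \ldots, i_r)\}$ be the corresponding quantities
in the Y-system. If, for every set of vectors with components $\xi^{(k)}_{\alpha}$ relative to
the X-coordinates and $\eta^{(k)}_{\alpha}$ relative to the Y-coordinates $(k = 1, \ldots, r)$, we have the equality

$$ B(\alpha_1, \ldots, \alpha_r)\,\eta^{(1)}_{\alpha_1} \cdots \eta^{(r)}_{\alpha_r} = A(\alpha_1, \ldots, \alpha_r)\,\xi^{(1)}_{\alpha_1} \cdots \xi^{(r)}_{\alpha_r} $$

(that is, the inner product is a scalar), then the set of functions $A(i_1, \ldots, i_r)$
represents a contravariant tensor of rank r in the X-coordinate system.
Proof. Since the $\xi^{(k)}_{\alpha_k}$ are the components of a covariant vector,

$$ \xi^{(k)}_{\alpha_k} = \frac{\partial y^{\beta_k}}{\partial x^{\alpha_k}}\,\eta^{(k)}_{\beta_k}. $$

Therefore

$$ \left[ B(\beta_1, \ldots, \beta_r) - A(\alpha_1, \ldots, \alpha_r)\,\frac{\partial y^{\beta_1}}{\partial x^{\alpha_1}} \cdots \frac{\partial y^{\beta_r}}{\partial x^{\alpha_r}} \right] \eta^{(1)}_{\beta_1} \cdots \eta^{(r)}_{\beta_r} = 0. $$

However, $\eta^{(1)}, \ldots, \eta^{(r)}$ are arbitrary; hence the term in the bracket must
vanish. Therefore

$$ B(\beta_1, \ldots, \beta_r) = \frac{\partial y^{\beta_1}}{\partial x^{\alpha_1}} \cdots \frac{\partial y^{\beta_r}}{\partial x^{\alpha_r}}\,A(\alpha_1, \ldots, \alpha_r), $$

which shows that

$$ A(\alpha_1, \ldots, \alpha_r) = A^{\alpha_1 \cdots \alpha_r}. $$

This particular form of the quotient law is taken by some authors as
the definition of the contravariant tensor of rank r. Thus, if the multilinear
form $A(\alpha_1, \ldots, \alpha_r)\,\xi^{(1)}_{\alpha_1} \cdots \xi^{(r)}_{\alpha_r}$ is an invariant, then $A(\alpha_1, \ldots, \alpha_r) = A^{\alpha_1 \cdots \alpha_r}$,
provided that the $\xi^{(k)}_{\alpha}$ are the components of arbitrary vectors.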




On the other hand, if $A(\alpha_1, \ldots, \alpha_r)\,\xi_{(1)}^{\alpha_1} \cdots \xi_{(r)}^{\alpha_r}$ is an invariant, for an
arbitrary choice of $\xi^{\alpha}$'s, then

$$ A(\alpha_1, \ldots, \alpha_r) = A_{\alpha_1 \cdots \alpha_r}. $$

It is obvious from proofs of Theorems I and II that many other quotient
laws can be stated. For example, if the inner product $A(i, \alpha)\xi_{\alpha j}$ of the set
of $n^2$ functions $A(i, j)$ with an arbitrary tensor is a covariant tensor of
rank two, then $A(i, j)$ represents a mixed tensor of the type $A^j_i$. The reader
can prove this fact by following the pattern used in proving Theorem I.
The tensor properties of the set $A(i, j)$ may be surmised from the division
algorithm. Thus, if $A(i, \alpha)\xi_{\alpha j} = A_{ij}$, then $A(i, \alpha) = A_{ij}/\xi_{\alpha j}$. Now if we
write the symbolic reciprocal of $\xi_{\alpha j}$ as $\xi^{\alpha j}$, we have $A(i, \alpha) = A_{ij}\,\xi^{\alpha j} = A^{\alpha}_{i}$.


27. Symmetric and Skew-Symmetric Tensors 


When an interchange of two covariant (or contravariant) indices in the
components $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}(x)$ of a tensor does not alter the value of components,
the tensor A is said to be symmetric with respect to those indices.
For example, a covariant tensor $A_{ij}(x)$ is symmetric if $A_{ij}(x) = A_{ji}(x)$.
The definition of symmetry of tensors obviously would not be satisfactory
if the symmetry of its components were not preserved under the coordinate
transformations. To see that this is indeed so, let us suppose that
$A_{i_1 i_2 \cdots i_r}(x) = A_{i_2 i_1 \cdots i_r}(x)$. Then $A_{i_1 i_2 \cdots i_r}(x) - A_{i_2 i_1 \cdots i_r}(x) = 0$. However,
the difference of two tensors is a tensor; and if a tensor vanishes in
one coordinate system, it vanishes in all admissible systems. Hence
$B_{i_1 i_2 \cdots i_r}(y) = B_{i_2 i_1 \cdots i_r}(y)$.
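
The invariance of symmetry can be checked numerically. The following is a minimal sketch, assuming a linear transformation $x^a = c^a_i y^i$ so that $\partial x^a/\partial y^i = c^a_i$; the matrices `C`, `SYM`, and `SKEW` are hypothetical sample data, and integer entries keep the arithmetic exact.

```python
# Transform a covariant rank-two tensor under the linear change of
# coordinates x^a = c[a][i] * y^i, for which dx^a/dy^i = c[a][i].

def transform_covariant(A, c):
    """Return B_ij = (dx^a/dy^i)(dx^b/dy^j) A_ab."""
    n = len(A)
    return [[sum(c[a][i] * c[b][j] * A[a][b]
                 for a in range(n) for b in range(n))
             for j in range(n)]
            for i in range(n)]

def is_symmetric(T):
    n = len(T)
    return all(T[i][j] == T[j][i] for i in range(n) for j in range(n))

def is_skew_symmetric(T):
    n = len(T)
    return all(T[i][j] == -T[j][i] for i in range(n) for j in range(n))

# Hypothetical sample data (not from the text).
C = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]        # nonsingular, determinant 5
SYM = [[1, 4, 5], [4, 2, 6], [5, 6, 3]]      # A_ij = A_ji
SKEW = [[0, 7, -2], [-7, 0, 9], [2, -9, 0]]  # A_ij = -A_ji

B_SYM = transform_covariant(SYM, C)
B_SKEW = transform_covariant(SKEW, C)
```

Both `is_symmetric(B_SYM)` and `is_skew_symmetric(B_SKEW)` hold for this data, in agreement with the argument above and with the corresponding statement for skew-symmetric tensors.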

We may say that a tensor is skew-symmetric (or antisymmetric) with 
respect to certain indices whenever an interchange of a pair of covariant 
(or contravariant) indices in the components merely changes the sign of 
the components. The skew-symmetry of tensors is likewise an invariant 
property. The proof of invariance of the skew-symmetry property is 
similar to that given for symmetry. However, as an exercise the reader 
may find it instructive to construct a proof based on the use of the law of 
transformation of components $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}$.

We will extend the notions of symmetry and skew-symmetry in Sec. 40. 


28. Relative Tensors 


We recall that a function $f(x^1, \ldots, x^n)$ represents a scalar in the X-reference
frame whenever in the Y-reference frame, determined by the
transformation $x^i = x^i(y^1, \ldots, y^n)$, the scalar is given by the formula
$g(y^1, \ldots, y^n) = f[x^1(y), \ldots, x^n(y)]$. We will encounter functions $f(x)$
which transform in accordance with the more general law, namely,

$$ g(y^1, \ldots, y^n) = f[x^1(y), \ldots, x^n(y)]\left|\frac{\partial x^i}{\partial y^j}\right|^{W}, \tag{28.1} $$

where $|\partial x^i/\partial y^j|$ denotes the Jacobian of the transformation and W is a
constant. We observe that, if the function $f(x)$ transforms in accordance
with the law 28.1, then

$$ f(x) = f(x)\left|\frac{\partial x^i}{\partial y^j}\right|^{W}\left|\frac{\partial y^j}{\partial x^k}\right|^{W} = g(y)\left|\frac{\partial y^j}{\partial x^k}\right|^{W}, $$

where we have made use of Theorem II of Sec. 20. Thus the formula 28.1
determines a class of invariant functions known as relative scalars of
weight W.

A relative scalar of weight zero is the scalar defined in Sec. 21. Some- 
times a scalar of weight zero is called an absolute scalar. 

A relative scalar of weight 1 is called a scalar density. The reason for this
terminology may be seen from the expression for the total mass of a
distribution of matter of density $\rho(x^1, x^2, x^3)$, the coordinates $x^i$ being
rectangular cartesian. The mass contained in a volume $\tau$ is given by the
integral

$$ M = \iiint_{\tau} \rho(x^1, x^2, x^3)\,dx^1\,dx^2\,dx^3. $$

If the coordinates $x^i$ are changed with the aid of the equations of
transformation $x^i = x^i(y^1, y^2, y^3)$, $(i = 1, 2, 3)$, the mass M is given by the integral

$$ M = \iiint_{\tau} \rho[x(y)]\left|\frac{\partial x^i}{\partial y^j}\right| dy^1\,dy^2\,dy^3. $$

It is clear that the density of distribution when referred to the Y-coordinates
is $\rho(y) = \rho(x)\left|\partial x^i/\partial y^j\right|$.
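
The role of the Jacobian factor can be seen in a small numerical experiment. The sketch below is a two-dimensional analogue chosen for brevity (an assumption; the text treats a volume in three dimensions): the mass of the unit disk with the hypothetical density $\rho = 1 + (x^1)^2 + (x^2)^2$ is computed once in cartesian coordinates and once in polar coordinates, where the integrand is $\rho$ multiplied by the Jacobian $r$.

```python
import math

def rho(x1, x2):
    """Hypothetical density, chosen only for illustration."""
    return 1.0 + x1 * x1 + x2 * x2

def mass_cartesian(density, n=200):
    """Midpoint rule for the integral of the density over the unit disk."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x1 = -1.0 + (i + 0.5) * h
        for j in range(n):
            x2 = -1.0 + (j + 0.5) * h
            if x1 * x1 + x2 * x2 <= 1.0:
                total += density(x1, x2) * h * h
    return total

def mass_polar(density, n=200):
    """Same mass in polar coordinates; the integrand carries the Jacobian r."""
    hr = 1.0 / n
    ht = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            x1, x2 = r * math.cos(t), r * math.sin(t)
            total += density(x1, x2) * r * hr * ht
    return total

M_CART = mass_cartesian(rho)
M_POLAR = mass_polar(rho)
```

The two values agree to within the accuracy of the quadrature, and both approximate the exact mass $3\pi/2$. Omitting the factor `r` in `mass_polar` would give a different number, which is why the density must transform as a relative scalar of weight 1.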

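We can also generalize the law of transformation of components of a
mixed tensor by considering the sets of quantities $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}(x)$ which
transform according to the formula

$$ B^{j_1 \cdots j_s}_{i_1 \cdots i_r}(y) = \left|\frac{\partial x^i}{\partial y^j}\right|^{W} \frac{\partial y^{j_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{j_s}}{\partial x^{\beta_s}}\,\frac{\partial x^{\alpha_1}}{\partial y^{i_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{i_r}}\,A^{\beta_1 \cdots \beta_s}_{\alpha_1 \cdots \alpha_r}(x). \tag{28.2} $$

The sets of quantities $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}(x)$ obeying this law of transformation are
called the components of a relative tensor of weight W.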

From the discussion in Sec. 24, and from the transitive property of
Jacobians, namely,

$$ \left|\frac{\partial x^i}{\partial z^j}\right| = \left|\frac{\partial x^i}{\partial y^k}\right|\left|\frac{\partial y^k}{\partial z^j}\right|, $$

it follows that the transformation 28.2 is transitive. In addition, from the
linear and homogeneous character of this transformation it follows that
if all components of a relative tensor vanish in one coordinate system,
they vanish in every coordinate system. An immediate corollary of this
is that a tensor equation involving relative tensors when true in one
coordinate system is valid in all coordinate systems. In this case the relative
tensors on two sides of equations must be of the same weight.
A little reflection will convince the reader that 


(a) Relative tensors of the same type and weight may be added, and the 
sum is a relative tensor of the same type and weight. 

(b) Relative tensors may be multiplied, the weight of the product being 
the sum of the weights of tensors entering in the product. 

(c) The operation of contraction on a relative tensor yields a relative 
tensor of the same weight as the original tensor. 


To distinguish mixed tensors, considered in the preceding sections, from
relative tensors, the term absolute tensor is frequently used to designate
the former. We shall encounter several relative tensors in applications of
tensor theory.

Problems 


1. Given the relation $A(i, j, k)B^{jk} = C^i$, where $B^{jk}$ is an arbitrary symmetric
tensor. Prove that $A(i, j, k) + A(i, k, j)$ is a tensor. Hence deduce that, if
$A(i, j, k)$ is symmetric in j and k, then $A(i, j, k)$ is a tensor.

2. Given the relation $A(i, j, k)B^{jk} = C^i$, where $B^{jk}$ is an arbitrary skew-symmetric
tensor. Prove that $A(i, j, k) - A(i, k, j)$ is a tensor. Hence, if
$A(i, j, k)$ is skew-symmetric in j and k, then $A(i, j, k)$ is a tensor.

3. If $a(i, j)\,dx^i\,dx^j$ is an invariant for an arbitrary vector $dx^i$, and $a(i, j)$ is
symmetric, show that $a(i, j)$ is a tensor $a_{ij}$.

4. If $a_{ij}$ is a tensor, show that $A^{ij}$, the cofactor of $a_{ij}$ in $|a_{ij}|$ divided by
$|a_{ij}| \neq 0$, is a tensor.

5. If $\phi(x^1, \ldots, x^n)$ is a scalar, show that $\{\partial^2\phi/\partial x^i\,\partial x^j\}$ is a tensor with respect
to a set of linear transformations of coordinates.

6. If $|a_{ij} - \lambda b_{ij}| = 0$ for $\lambda = \lambda_1$ in one set of variables, then $|a'_{ij} - \lambda b'_{ij}| = 0$
for $\lambda = \lambda_1$ in the new set of variables. In other words, the roots of the polynomial
$|a_{ij} - \lambda b_{ij}|$ are invariants.

7. Prove that a tensor with skew-symmetric components in one coordinate
system has skew-symmetric components in all coordinate systems.

8. Show that every tensor can be expressed as the sum of two tensors, one of
which is symmetric and the other skew-symmetric.

9. Show that the tensor equation $a^i_j\lambda_i = \alpha\lambda_j$, where $\alpha$ is an invariant and $\lambda_j$
an arbitrary vector, demands that $a^i_j = \delta^i_j\,\alpha$.

10. Prove directly from the law of transformation of components that symmetry
of a tensor is an invariant property.

11. The square of the element of arc ds appears in the form

$$ ds^2 = g_{ij}(x)\,dx^i\,dx^j. $$

Let T be an admissible transformation of coordinates $x^i = x^i(y^1, \ldots, y^n)$; then
$ds^2 = h_{ij}\,dy^i\,dy^j$. Prove that $|g_{ij}|$ is a relative scalar of weight two. Hint:
$h_{ij}(y) = \dfrac{\partial x^\alpha}{\partial y^i}\dfrac{\partial x^\beta}{\partial y^j}\,g_{\alpha\beta}(x)$, and recall the rule for multiplication of determinants.

12. How many independent components are there in a skew-symmetric tensor
of rank two?

13. If $a_{ij}$ is a skew-symmetric tensor and $A^i$ is a contravariant vector, then
$a_{ij}A^iA^j = 0$.

14. Prove that, if $A(i, j, k)A^iB^jC_k$ is a scalar for arbitrary vectors $A^i$, $B^j$, and
$C_k$, then $A(i, j, k)$ is a tensor.


29. The Metric Tensor 


In Sec. 5 we introduced the idea of n-dimensional space $E_n$ by extending
the concepts familiar to us from our experience with ordinary Euclidean
geometry. Thus, in defining the length $|x|$ of a vector x, we used the
generalized formula of Pythagoras, $|x| = \sqrt{x^i x^i}$, where the $x^i$ are the
components of the vector x referred to a set of orthogonal cartesian axes.
(See Sec. 5.) If we now consider a displacement vector $dx^i$, $(i = 1, \ldots, n)$,
determined by a pair of points $P(x)$ and $P'(x + dx)$, wherein the coordinates
$x^i$ are orthogonal cartesian, the formula of Pythagoras gives for the
square of the distance between P and P' the expression

$$ ds^2 = dx^i\,dx^i, \qquad (i = 1, 2, \ldots, n). \tag{29.1} $$

We shall call ds the element of arc in $E_n$.
A change of coordinate system, determined by the transformation

$$ x^i = x^i(y^1, \ldots, y^n), \qquad (i = 1, \ldots, n), \tag{29.2} $$

permits us to write the formula 29.1 as

$$ ds^2 = \frac{\partial x^i}{\partial y^\alpha}\frac{\partial x^i}{\partial y^\beta}\,dy^\alpha\,dy^\beta, \tag{29.3} $$

since $dx^i = (\partial x^i/\partial y^\alpha)\,dy^\alpha$. We can thus write the formula for the square of
the element of arc in the Y-reference frame as a quadratic form

$$ ds^2 = g_{\alpha\beta}(y)\,dy^\alpha\,dy^\beta, \tag{29.4} $$

where the coefficients $g_{\alpha\beta}(y)$ are defined by

$$ g_{\alpha\beta}(y) = \frac{\partial x^i}{\partial y^\alpha}\frac{\partial x^i}{\partial y^\beta}. \tag{29.5} $$

These coefficients are functions of the variables $(y^i)$, and they are obviously
symmetric with respect to the indices $\alpha$ and $\beta$.
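
Formula 29.5 lends itself to a direct numerical check. The following is a minimal sketch, assuming plane polar coordinates $x^1 = y^1\cos y^2$, $x^2 = y^1\sin y^2$ as the example transformation (an illustration chosen here, not one of the cases treated in the text); the partial derivatives are estimated by central differences, and all function names are hypothetical.

```python
import math

def polar_to_cartesian(y):
    """x^1 = y^1 cos y^2, x^2 = y^1 sin y^2 (plane polar coordinates)."""
    r, theta = y
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(x_of_y, y, h=1e-6):
    """Central-difference estimate of J[i][a] = dx^i/dy^a at the point y."""
    n = len(y)
    cols = []
    for a in range(n):
        yp = list(y)
        ym = list(y)
        yp[a] += h
        ym[a] -= h
        xp, xm = x_of_y(yp), x_of_y(ym)
        cols.append([(xp[i] - xm[i]) / (2 * h) for i in range(len(xp))])
    # cols[a][i] holds dx^i/dy^a; transpose so that J[i][a] = dx^i/dy^a.
    return [[cols[a][i] for a in range(n)] for i in range(len(cols[0]))]

def metric(x_of_y, y):
    """g_ab(y) = sum over i of (dx^i/dy^a)(dx^i/dy^b), formula 29.5."""
    J = jacobian(x_of_y, y)
    n = len(y)
    return [[sum(J[i][a] * J[i][b] for i in range(len(J)))
             for b in range(n)] for a in range(n)]

G = metric(polar_to_cartesian, [2.0, 0.7])
```

At the point $y^1 = 2$ the computed coefficients are close to $g_{11} = 1$, $g_{12} = g_{21} = 0$, $g_{22} = (y^1)^2 = 4$, that is, $ds^2 = (dy^1)^2 + (y^1)^2 (dy^2)^2$.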

Since the square of the element of arc ds is an invariant, we conclude
(see Problem 3) that the set of functions $g_{\alpha\beta}(y)$ represents a symmetric
tensor. This tensor is called the metric tensor, because, as will be shown
in Chapter 3, all essential metric properties of Euclidean space are completely
determined by this tensor.

We have obtained the formula 29.4 by starting with expression 29.1,
which is characteristic of the Euclidean space. A transformation of
coordinates 29.2 clearly does not alter its metric properties, and formula
29.4 simply enables us to calculate distances in the Euclidean space when
it is covered by a coordinate system Y. By starting with the form 29.1 and
the transformation 29.2, we have shown that the set of n functions 29.2
satisfies a system of $\tfrac{1}{2}n(n + 1)$ partial differential equations 29.5, in which
the $g_{\alpha\beta}(y)$ are known functions of the variables y. Now, if the functions
$g_{\alpha\beta}$ are specified arbitrarily, the system of $\tfrac{1}{2}n(n + 1)$ partial differential
equations 29.5 for n unknown functions $x^i(y)$, in general, will have no
solution. In the event the $g_{\alpha\beta}$'s are such that the system 29.5 has a solution,
the existence of a transformation of coordinates which reduces the quadratic
form 29.4 to the sum of squares 29.1 is assured. In that event the
metric tensor $g_{\alpha\beta}$ defines a Euclidean manifold. If, on the other hand,
the functions $g_{\alpha\beta}(y)$ are such that the system 29.5 has no solution, then no
admissible transformation of coordinates exists which reduces the expression
29.4 for the square of the arc element to the Pythagorean form
29.1. We shall say then that the manifold is non-Euclidean. A set of
necessary and sufficient conditions for the integrability of equations 29.5
will be deduced in Sec. 39.

We suppose in the remainder of this chapter that our tensors are defined
in metric manifolds and that the element of arc ds is given by the quadratic
form $ds^2 = g_{ij}(x)\,dx^i\,dx^j$, where the $g_{ij}$'s are functions belonging to the
class $C^1$. We also assume that the symmetric tensor $g_{ij}(x)$ is such that
$|g_{ij}| \neq 0$ at any point of the region under discussion, but do not assume
that our manifold is necessarily Euclidean.




Problems 


1. Let $E_3$ be covered by orthogonal cartesian coordinates $x^i$, and consider a
transformation
$$ x^1 = y^1 \sin y^2 \cos y^3, \qquad x^2 = y^1 \sin y^2 \sin y^3, \qquad x^3 = y^1 \cos y^2, $$
where the $y^i$ are spherical polar coordinates ($y^1 = r$, $y^2 = \theta$, $y^3 = \phi$). What are
the metric coefficients $g_{ij}(y)$?
2. Let $E_3$ be covered by orthogonal cartesian coordinates $x^i$, and let
$$ x^1 = y^1 \cos y^2, \qquad x^2 = y^1 \sin y^2, \qquad x^3 = y^3 $$
represent a transformation to cylindrical coordinates $y^i$. Find the expression
for $ds^2$ in cylindrical coordinates.
3. Let $E_3$ be covered by orthogonal cartesian coordinates $x^i$, and let $x^i = a^i_j y^j$,
$|a^i_j| \neq 0$, $(i, j = 1, 2, 3)$, represent a linear transformation of coordinates.
Determine the metric coefficients $g_{ij}(y)$. Discuss the case when the
transformation is orthogonal.


30. The Fundamental and Associated Tensors 


Let $g_{ij}(x)$ represent a symmetric tensor such that the $g_{ij}(x)$ belong to
class $C^2$ and $g = |g_{ij}| \neq 0$ at any point of the region. We construct, with
the aid of the set of functions $g_{ij}(x)$, a new set of functions $g^{ij}(x)$, representing
a contravariant tensor, which is such that $g^{i\alpha}g_{\alpha j} = \delta^i_j$. The tensors
$g_{ij}(x)$ and $g^{ij}(x)$ will play an essential role in all our subsequent considerations,
and for that reason they will be called the fundamental tensors.

Let us form a set of $n^2$ functions

$$ g(i, j) = \frac{G^{ij}}{g}, \tag{30.1} $$

where $G^{ij}$ is the cofactor of the element $g_{ij}$ in the determinant g. The
notation used in the definition 30.1 anticipates that the $g(i, j)$ form a
contravariant tensor, and, indeed, we will prove that they define a symmetric,
contravariant tensor $g^{ij}$. The symmetry of the set of functions
$g(i, j)$ follows directly from the observation that the determinant obtained
by deleting the ith row and the jth column in a symmetric determinant
$|g_{ij}|$ has the same value as the determinant obtained by deleting the jth
row and the ith column. We prove next, by means of a quotient law, that
the $g(i, j)$'s transform according to a contravariant law. We first note
that, if $\xi^i$ is an arbitrary contravariant vector, then

$$ \xi_i = g_{i\alpha}\,\xi^\alpha \tag{30.2} $$

is an arbitrary covariant vector, since $|g_{ij}| \neq 0$. Now, if both sides of the
formula 30.2 are multiplied by $g(\beta, i) = G^{\beta i}/g$ and summed on i, we get

$$ g(\beta, i)\,\xi_i = \frac{G^{\beta i}}{g}\,g_{i\alpha}\,\xi^\alpha. \tag{30.3} $$

However, by (7.4), $G^{\beta i}g_{i\alpha} = g\,\delta^\beta_\alpha$, so that (30.3) can be written as

$$ g(\beta, i)\,\xi_i = \xi^\beta. $$

Since $\xi_i$ is arbitrary, we conclude from Theorem I of Sec. 26 that $g(\beta, i)$ is
a contravariant tensor of rank two. We can thus write (30.1) as

$$ g^{ij} = \frac{G^{ij}}{g}. \tag{30.4} $$

The reciprocal relation $g^{i\alpha}g_{\alpha j} = \delta^i_j$ follows directly from the fact that
$G^{i\alpha}g_{\alpha j} = \delta^i_j\,g$. Incidentally, we can conclude that the set of cofactors $G^{ij}$
represents a contravariant tensor of weight two. This follows from
Problem 11 of Sec. 28, where it is indicated that the determinant $|g_{ij}|$ is a
relative scalar of weight two.
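
The construction $g^{ij} = G^{ij}/g$ can be carried out mechanically. The following is a minimal sketch using exact rational arithmetic; the matrix `G_LOWER` is a hypothetical symmetric sample with nonvanishing determinant, and the helper names are illustrative only.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix obtained by deleting row i and column j."""
    return [[m[r][c] for c in range(len(m)) if c != j]
            for r in range(len(m)) if r != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j))
               for j in range(len(m)))

def conjugate_tensor(g):
    """g^{ij} = G^{ij} / g, with G^{ij} the cofactor of g_ij (formula 30.1)."""
    n = len(g)
    d = det(g)
    if d == 0:
        raise ValueError("the determinant |g_ij| must not vanish")
    return [[Fraction((-1) ** (i + j) * det(minor(g, i, j)), d)
             for j in range(n)] for i in range(n)]

# Hypothetical symmetric sample with determinant 18.
G_LOWER = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
G_UPPER = conjugate_tensor(G_LOWER)

# Inner product g^{ia} g_{aj}, which should reproduce the Kronecker delta.
PRODUCT = [[sum(G_UPPER[i][a] * G_LOWER[a][j] for a in range(3))
            for j in range(3)] for i in range(3)]
```

For this sample `PRODUCT` is the identity matrix and `G_UPPER` is symmetric, as the argument above requires.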

A tensor obtained by the process of inner multiplication of any tensor
$A^{j_1 \cdots j_s}_{i_1 \cdots i_r}$ with either of the fundamental tensors $g_{ij}$ or $g^{ij}$ is called a tensor
associated with the given tensor.

As an illustration of this definition consider a tensor $A_{ijk}$ and form the
following inner products: $g^{\alpha i}A_{ijk} = A^{\alpha}_{\cdot jk}$, $g^{\alpha j}A_{ijk} = A^{\cdot\alpha}_{i\cdot k}$, $g^{\alpha k}A_{ijk} = A^{\cdot\cdot\alpha}_{ij\cdot}$.
All these tensors are associated with the tensor $A_{ijk}$. Operating on these
tensors with $g^{ij}$ again, we can form other associated tensors. It will be
observed that the operation of inner multiplication of $g_{ij}$ with any tensor,
say $A^{ij}_{lm}$, lowers the index with respect to which the summation is performed.
Thus $g_{\alpha i}A^{ij}_{lm} = A^{\cdot j}_{\alpha lm}$, while $g^{\alpha l}A^{ij}_{lm} = A^{ij\alpha}_{\cdot m}$. The procedure of raising and
lowering indices is clearly reversible. In the foregoing formulas the
position occupied by the raised (or lowered) index is indicated by a dot.
In general, such systems as $g^{i\alpha}A_{\alpha j} = A^{i}_{\cdot j}$ and $g^{i\alpha}A_{j\alpha} = A^{\cdot i}_{j}$ are different.
They are identical whenever $A_{ij} = A_{ji}$.

It is possible to interpret all tensors associated with a given tensor as
representing the same tensor in different reference frames. This interpretation
is particularly simple for the covariant vector $A_i$ and its associated
vector $g^{i\alpha}A_\alpha = A^i$, whenever the space is Euclidean. We will return to
this matter in Sec. 45.


31. Christoffel’s Symbols 


We introduce in this section certain combinations of partial derivatives
of the fundamental tensor $g_{ij}(x)$, which will prove useful in the development
of the calculus of tensors. Let us construct a set of functions denoted by
the symbol

$$ [ij, k] = \frac{1}{2}\left(\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right), \qquad (i, j, k = 1, \ldots, n), \tag{31.1} $$

and call them the Christoffel 3-index symbols of the first kind. The set of
functions

$$ \left\{{k \atop ij}\right\} = g^{k\alpha}[ij, \alpha], \tag{31.2} $$

where $g^{k\alpha}$ is the contravariant tensor, constructed with the aid of the $g_{ij}$'s
in the manner described in the preceding section, are the Christoffel 3-index
symbols of the second kind.

Evidently there are n distinct Christoffel symbols of each kind for each
independent $g_{ij}$, and, since the number of independent $g_{ij}$'s is $\tfrac{1}{2}n(n + 1)$,
the number N of independent Christoffel symbols is $N = \tfrac{1}{2}n^2(n + 1)$. We
proceed to deduce several properties and identities involving Christoffel's
symbols, which will prove useful to us in the sequel.
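
Definitions 31.1 and 31.2 translate directly into a short computation. The sketch below assumes the metric of plane polar coordinates, $ds^2 = (dy^1)^2 + (y^1)^2 (dy^2)^2$, as a sample (an illustration chosen here); derivatives of the metric are estimated by central differences, the inverse is written out for the two-dimensional case only, and indices in the code are zero-based.

```python
def metric_polar(y):
    """Sample metric: ds^2 = dr^2 + r^2 dtheta^2 with r = y[0]."""
    r = y[0]
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(metric, y, k, h=1e-5):
    """Central-difference estimate of dg_ij/dy^k."""
    yp = list(y)
    ym = list(y)
    yp[k] += h
    ym[k] -= h
    gp, gm = metric(yp), metric(ym)
    n = len(y)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]

def christoffel_first(metric, y):
    """F[i][j][k] = [ij, k] = (dg_ik/dy^j + dg_jk/dy^i - dg_ij/dy^k) / 2."""
    n = len(y)
    dg = [d_metric(metric, y, k) for k in range(n)]  # dg[k][i][j] = dg_ij/dy^k
    return [[[0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
              for k in range(n)] for j in range(n)] for i in range(n)]

def inverse_2x2(g):
    """Contravariant components g^{ij} for a 2 x 2 metric."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]

def christoffel_second(metric, y):
    """S[k][i][j] = {k; ij} = g^{ka} [ij, a], formula 31.2."""
    n = len(y)
    first = christoffel_first(metric, y)
    g_up = inverse_2x2(metric(y))
    return [[[sum(g_up[k][a] * first[i][j][a] for a in range(n))
              for j in range(n)] for i in range(n)] for k in range(n)]

POINT = [2.0, 0.3]
FIRST = christoffel_first(metric_polar, POINT)
SECOND = christoffel_second(metric_polar, POINT)
```

At $y^1 = r = 2$ the nonvanishing symbols of the second kind come out close to $\left\{{1 \atop 22}\right\} = -r = -2$ and $\left\{{2 \atop 12}\right\} = \left\{{2 \atop 21}\right\} = 1/r = 0.5$.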

It is clear from definitions 31.1 and 31.2 that the Christoffel symbols
are symmetric with respect to the indices i and j. Thus

$$ [ij, k] = [ji, k], \tag{31.3} $$

and

$$ \left\{{k \atop ij}\right\} = \left\{{k \atop ji}\right\}. \tag{31.4} $$

We see from the defining formula 31.2 that we can pass from the
symbol of the first kind $[ij, \alpha]$ to the symbol $\left\{{k \atop ij}\right\}$ by forming the inner
product $g^{k\alpha}[ij, \alpha]$. Now, if we multiply equation 31.2 through by $g_{k\beta}$,
and recall that $g_{k\beta}g^{k\alpha} = \delta^\alpha_\beta$, we get

$$ g_{k\beta}\left\{{k \atop ij}\right\} = \delta^\alpha_\beta\,[ij, \alpha] = [ij, \beta]. \tag{31.5} $$

Formulas 31.2 and 31.5 are easy to remember if it is noted that the operation
of inner multiplication of $[ij, \alpha]$ with $g^{k\alpha}$ raises the index and
replaces the square brackets by the braces. The multiplication of $\left\{{k \atop ij}\right\}$ by
$g_{k\beta}$, on the other hand, lowers the index and replaces the braces by the
square brackets. Formally, these operations of multiplication by $g^{k\alpha}$ and
$g_{k\beta}$ are analogous to raising and lowering the indices on tensors, but we
will see that the Christoffel symbols, in general, are not tensors.




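From (31.1) we readily deduce an expression for the partial derivative
of the fundamental tensor $g_{ij}$ in terms of the symbols of the first kind.
It is

$$ \frac{\partial g_{ij}}{\partial x^k} = [ik, j] + [jk, i], \tag{31.6} $$

which can also be written as

$$ \frac{\partial g_{ij}}{\partial x^k} = g_{\alpha j}\left\{{\alpha \atop ik}\right\} + g_{\alpha i}\left\{{\alpha \atop jk}\right\} \tag{31.7} $$

if we note (31.5). An analogous formula for the partial derivatives of the
contravariant tensor $g^{ij}$ can be obtained by differentiating the identity
$g_{i\alpha}g^{\alpha j} = \delta^j_i$ with respect to $x^k$. We get

$$ \frac{\partial g_{i\alpha}}{\partial x^k}\,g^{\alpha j} + g_{i\alpha}\frac{\partial g^{\alpha j}}{\partial x^k} = 0, $$

or

$$ g_{i\alpha}\frac{\partial g^{\alpha j}}{\partial x^k} = -g^{\alpha j}\frac{\partial g_{i\alpha}}{\partial x^k}. $$

To solve this system of equations for $\partial g^{\alpha j}/\partial x^k$ we multiply both sides by
$g^{i\beta}$ and get

$$ g^{i\beta}g_{i\alpha}\frac{\partial g^{\alpha j}}{\partial x^k} = -g^{i\beta}g^{\alpha j}\frac{\partial g_{i\alpha}}{\partial x^k}. $$

Since $g^{i\beta}g_{i\alpha} = \delta^\beta_\alpha$, we have

$$ \frac{\partial g^{\beta j}}{\partial x^k} = -g^{i\beta}g^{\alpha j}\,([ik, \alpha] + [\alpha k, i]), $$

where we made use of the formula 31.6. Noting the definition 31.2, we
have finally

$$ \frac{\partial g^{\beta j}}{\partial x^k} = -g^{i\beta}\left\{{j \atop ik}\right\} - g^{\alpha j}\left\{{\beta \atop \alpha k}\right\}, $$

which is the same as

$$ \frac{\partial g^{ij}}{\partial x^k} = -g^{i\alpha}\left\{{j \atop \alpha k}\right\} - g^{j\alpha}\left\{{i \atop \alpha k}\right\}. \tag{31.8} $$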


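We conclude this section with a derivation of the formula for the
derivative of the logarithm of the determinant $|g_{ij}|$; this will be useful to
us in writing a compact expression for the divergence of a vector field, as
well as in several other connections.
The determinant $g = |g_{ij}|$ can be expanded by minors to obtain

$$ g = g_{i1}G^{i1} + g_{i2}G^{i2} + \cdots + g_{in}G^{in} \qquad \text{(no summation on } i \text{ or } n\text{)} $$
$$ \phantom{g} = g_{i\alpha}G^{i\alpha} \qquad \text{(sum on } \alpha \text{ only, } i \text{ fixed)}, \tag{31.9} $$

where $G^{ij}$ is the cofactor of the element $g_{ij}$. Since the $g_{ij}$'s are functions of
$x^1, \ldots, x^n$, the $G^{ij}$'s are also functions of the same variables. From (31.9)
we deduce that

$$ \frac{\partial g}{\partial g_{ij}} = \frac{\partial (g_{i\alpha}G^{i\alpha})}{\partial g_{ij}} = g_{i\alpha}\frac{\partial G^{i\alpha}}{\partial g_{ij}} + G^{i\alpha}\frac{\partial g_{i\alpha}}{\partial g_{ij}} \qquad \text{(sum on } \alpha \text{ only, } i \text{ fixed)}. $$

Since $G^{i\alpha}$ contains no $g_{ij}$, $\partial G^{i\alpha}/\partial g_{ij} = 0$, and since the $g_{ij}$'s are independent
variables in this formula, $\partial g_{i\alpha}/\partial g_{ij} = \delta^j_\alpha$. Thus

$$ \frac{\partial g}{\partial g_{ij}} = G^{i\alpha}\delta^j_\alpha = G^{ij}. $$

But

$$ \frac{\partial g}{\partial x^l} = \frac{\partial g}{\partial g_{\alpha\beta}}\frac{\partial g_{\alpha\beta}}{\partial x^l} = G^{\alpha\beta}\frac{\partial g_{\alpha\beta}}{\partial x^l}, $$

and, if we recall that $g^{\alpha\beta} = G^{\alpha\beta}/g$, the foregoing formula becomes

$$ \frac{\partial g}{\partial x^l} = g\,g^{\alpha\beta}\frac{\partial g_{\alpha\beta}}{\partial x^l}. $$

If we now insert for $\partial g_{\alpha\beta}/\partial x^l$ from (31.7), we get

$$ \frac{\partial g}{\partial x^l} = g\,g^{\alpha\beta}\left( g_{\gamma\beta}\left\{{\gamma \atop \alpha l}\right\} + g_{\gamma\alpha}\left\{{\gamma \atop \beta l}\right\} \right) = g\left( \left\{{\alpha \atop \alpha l}\right\} + \left\{{\beta \atop \beta l}\right\} \right) = 2g\left\{{\alpha \atop \alpha l}\right\}. $$

Therefore we can write $\dfrac{1}{2g}\dfrac{\partial g}{\partial x^l} = \left\{{\alpha \atop \alpha l}\right\}$, and hence

$$ \frac{\partial}{\partial x^l}\log\sqrt{g} = \left\{{\alpha \atop \alpha l}\right\}. \tag{31.10} $$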
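
As a quick check of formula 31.10 (a worked instance chosen here for illustration, not taken from the text), consider plane polar coordinates $y^1 = r$, $y^2 = \theta$, for which $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$, and the only nonvanishing symbols of the second kind are $\left\{{1 \atop 22}\right\} = -r$ and $\left\{{2 \atop 12}\right\} = \left\{{2 \atop 21}\right\} = 1/r$:

```latex
g = r^{2}, \qquad \sqrt{g} = r, \qquad
\frac{\partial}{\partial y^{1}} \log\sqrt{g} = \frac{1}{r}
  = \left\{ {1 \atop 11} \right\} + \left\{ {2 \atop 21} \right\}
  = 0 + \frac{1}{r},
\qquad
\frac{\partial}{\partial y^{2}} \log\sqrt{g} = 0
  = \left\{ {1 \atop 12} \right\} + \left\{ {2 \atop 22} \right\}.
```

Both components agree with the contracted symbol $\left\{{\alpha \atop \alpha l}\right\}$.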


We close this section with some remarks about different notations used
for the Christoffel symbols by various authors. The notation $[ij, k]$ for the
symbol of the first kind is fairly universal, but there are several different
notations for the symbol $\left\{{k \atop ij}\right\}$. Thus, many writers use the symbol $\{ij, k\}$.




P. Appell, in Traité de mécanique rationelle, vol. 5, uses H for the symbol 
ij 
of the first kind and 4 for the second kind. The followers of the Princeton
school generally use the symbol $\Gamma^k_{ij}$ for the symbol $\left\{{k \atop ij}\right\}$ adopted in this
book. Although the notation $\Gamma^k_{ij}$ has some advantages, it suggests that
the symbol of the second kind is a tensor. This, however, is not always
true, as will be seen from the developments of Sec. 32.


Problems 


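1. Show that $\dfrac{\partial g_{ij}}{\partial x^k} - \dfrac{\partial g_{jk}}{\partial x^i} = [jk, i] - [ij, k]$.

2. Show that, if $g_{ij} = 0$ for $i \neq j$, then $\left\{{k \atop ij}\right\} = 0$ whenever i, j, and k are
distinct.

3. Show that, if $g_{ij} = 0$ for $i \neq j$, then

$$ \left\{{i \atop ii}\right\} = \frac{1}{2}\frac{\partial \log g_{ii}}{\partial x^i}, \qquad \left\{{i \atop ij}\right\} = \frac{1}{2}\frac{\partial \log g_{ii}}{\partial x^j}, \qquad \left\{{i \atop jj}\right\} = -\frac{1}{2g_{ii}}\frac{\partial g_{jj}}{\partial x^i}, $$

where we suspend the summation convention and suppose that $i \neq j$.

4. If $|g_{ij}| \neq 0$, show that

$$ g_{\alpha\beta}\frac{\partial}{\partial x^\gamma}\left\{{\beta \atop ij}\right\} = \frac{\partial}{\partial x^\gamma}[ij, \alpha] - \left\{{\beta \atop ij}\right\}([\beta\gamma, \alpha] + [\alpha\gamma, \beta]). $$

5. If $y^i = a^i_j x^j$ is a transformation from a set of orthogonal cartesian variables
$y^i$ to a set of oblique cartesian coordinates $x^i$ covering $E_3$, what are the metric
coefficients $g_{ij}$ in $ds^2 = g_{ij}\,dx^i\,dx^j$?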


32. Transformation of Christoffel’s Symbols 


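We have already remarked that the Christoffel symbols do not, in
general, represent tensors. In this section we deduce the laws of
transformation for the sets of functions $[ij, k]$ and $\left\{{k \atop ij}\right\}$ under coordinate
transformations $y^i = y^i(x^1, \ldots, x^n)$, which will from now on belong to
the class $C^2$. The functions $g_{ij}(x)$ are assumed to belong to class $C^1$, and
their transforms to the Y-coordinate system are denoted by the symbols
$h_{ij}(y)$, so that

$$ h_{ij}(y) = \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,g_{\alpha\beta}. \tag{32.1} $$

Let us construct the Christoffel symbols ${}_{y}[ij, k]$, where the index y signifies
that they refer to the Y-coordinate system; then

$$ {}_{y}[ij, k] = \frac{1}{2}\left(\frac{\partial h_{ik}}{\partial y^j} + \frac{\partial h_{jk}}{\partial y^i} - \frac{\partial h_{ij}}{\partial y^k}\right). \tag{32.2} $$

Differentiating (32.1) we get

$$ \frac{\partial h_{ij}}{\partial y^k} = g_{\alpha\beta}\left(\frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^k}\frac{\partial x^\beta}{\partial y^j} + \frac{\partial x^\alpha}{\partial y^i}\frac{\partial^2 x^\beta}{\partial y^j\,\partial y^k}\right) + \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\frac{\partial x^\gamma}{\partial y^k}\frac{\partial g_{\alpha\beta}}{\partial x^\gamma}. $$

Since $g_{\alpha\beta} = g_{\beta\alpha}$, we can interchange the dummy indices $\alpha$ and $\beta$ in the
second term within parentheses and obtain

$$ \frac{\partial h_{ij}}{\partial y^k} = g_{\alpha\beta}\left(\frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^k}\frac{\partial x^\beta}{\partial y^j} + \frac{\partial^2 x^\alpha}{\partial y^j\,\partial y^k}\frac{\partial x^\beta}{\partial y^i}\right) + \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\frac{\partial x^\gamma}{\partial y^k}\frac{\partial g_{\alpha\beta}}{\partial x^\gamma}. $$

The partial derivatives $\partial h_{ik}/\partial y^j$ and $\partial h_{jk}/\partial y^i$, entering in (32.2), can be
obtained from this formula by a cyclic permutation of indices, and the
substitution in (32.2) yields

$$ {}_{y}[ij, k] = \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\frac{\partial x^\gamma}{\partial y^k}\,{}_{x}[\alpha\beta, \gamma] + g_{\alpha\beta}\frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}\frac{\partial x^\beta}{\partial y^k}, \tag{32.3} $$

which shows that $[\alpha\beta, \gamma]$ is not a tensor unless the second term on the right
vanishes. The second term will vanish identically if the coordinate
transformation is affine, that is, if $y^i = c^i_j x^j$ and the $c^i_j$'s are constants.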


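Similarly, we can easily show that the Christoffel symbols of the second
kind are not tensors in general. Indeed, we note from formula 31.2 that

$$ {}_{y}\!\left\{{k \atop ij}\right\} = h^{k\mu}\,{}_{y}[ij, \mu], $$

where

$$ h^{k\mu} = \frac{\partial y^k}{\partial x^\lambda}\frac{\partial y^\mu}{\partial x^\nu}\,g^{\lambda\nu}. $$

If we multiply (32.3) (with k replaced by $\mu$) on the left by $h^{k\mu}$ and on the
right by its equal from the formula written just above, and simplify, we get

$$ {}_{y}\!\left\{{k \atop ij}\right\} = \frac{\partial y^k}{\partial x^\lambda}\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,g^{\lambda\gamma}\,{}_{x}[\alpha\beta, \gamma] + \frac{\partial y^k}{\partial x^\lambda}\,g^{\lambda\beta}g_{\alpha\beta}\frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}. $$

Thus

$$ {}_{y}\!\left\{{k \atop ij}\right\} = \frac{\partial y^k}{\partial x^\gamma}\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,{}_{x}\!\left\{{\gamma \atop \alpha\beta}\right\} + \frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}\frac{\partial y^k}{\partial x^\alpha}, \tag{32.4} $$

which shows that the symbols of the second kind are not tensors unless the
coordinate transformation is affine.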
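
To see the inhomogeneous term of (32.4) at work, consider (as an illustration chosen here, not an example from the text) the passage from plane cartesian coordinates $x^1, x^2$, in which all Christoffel symbols vanish, to polar coordinates $y^1 = r$, $y^2 = \theta$ with $x^1 = r\cos\theta$, $x^2 = r\sin\theta$. Only the second term of (32.4) survives:

```latex
{}_{y}\!\left\{ {1 \atop 22} \right\}
  = \frac{\partial^{2} x^{\alpha}}{\partial\theta^{2}}\,
    \frac{\partial r}{\partial x^{\alpha}}
  = (-r\cos\theta)\cos\theta + (-r\sin\theta)\sin\theta = -r,
\qquad
{}_{y}\!\left\{ {2 \atop 12} \right\}
  = \frac{\partial^{2} x^{\alpha}}{\partial r\,\partial\theta}\,
    \frac{\partial \theta}{\partial x^{\alpha}}
  = (-\sin\theta)\left(-\frac{\sin\theta}{r}\right)
    + (\cos\theta)\left(\frac{\cos\theta}{r}\right) = \frac{1}{r}.
```

The symbols vanish in one coordinate system and not in the other, which could not happen if they were the components of a tensor.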


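The system of equations 32.4 can be solved for $\partial^2 x^\alpha/\partial y^i\,\partial y^j$ as follows.
Multiply (32.4) by $\partial x^m/\partial y^k$, sum with respect to k, and obtain

$$ {}_{y}\!\left\{{k \atop ij}\right\}\frac{\partial x^m}{\partial y^k} = \frac{\partial x^m}{\partial y^k}\frac{\partial y^k}{\partial x^\gamma}\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,{}_{x}\!\left\{{\gamma \atop \alpha\beta}\right\} + \frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}\frac{\partial y^k}{\partial x^\alpha}\frac{\partial x^m}{\partial y^k}. $$

Since $\partial x^m/\partial x^\gamma = \delta^m_\gamma$ and $\partial x^m/\partial x^\alpha = \delta^m_\alpha$, this expression yields

$$ \frac{\partial^2 x^m}{\partial y^i\,\partial y^j} = {}_{y}\!\left\{{k \atop ij}\right\}\frac{\partial x^m}{\partial y^k} - {}_{x}\!\left\{{m \atop \alpha\beta}\right\}\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}. \tag{32.5} $$

Obviously y and x can be interchanged, and it follows from (32.5) that

$$ \frac{\partial^2 y^m}{\partial x^i\,\partial x^j} = {}_{x}\!\left\{{k \atop ij}\right\}\frac{\partial y^m}{\partial x^k} - {}_{y}\!\left\{{m \atop \alpha\beta}\right\}\frac{\partial y^\alpha}{\partial x^i}\frac{\partial y^\beta}{\partial x^j}. \tag{32.6} $$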

The important formulas 32.5 and 32.6 were first deduced in an entirely 
different way by E. B. Christoffel in a memoir concerned with a study of 
equivalence of quadratic differential forms.? We will make use of these 
formulas to define the operations of tensorial differentiation. 


33. Covariant Differentiation of Tensors 


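We have observed, in Sec. 22, that the set of partial derivatives $\partial f/\partial x^i$
of a scalar function $f(x^1, \ldots, x^n)$ represents a covariant vector, since
$\dfrac{\partial f}{\partial y^i} = \dfrac{\partial f}{\partial x^\alpha}\dfrac{\partial x^\alpha}{\partial y^i}$. But if we form the set of partial derivatives $\dfrac{\partial}{\partial y^j}\!\left(\dfrac{\partial f}{\partial y^i}\right)$ of the
covariant vector $\dfrac{\partial f}{\partial y^i}$, we get

$$ \frac{\partial^2 f}{\partial y^i\,\partial y^j} = \frac{\partial}{\partial y^j}\left(\frac{\partial f}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial y^i}\right) = \frac{\partial^2 f}{\partial x^\alpha\,\partial x^\beta}\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j} + \frac{\partial f}{\partial x^\alpha}\frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}, $$

which, because of the presence of the term $\dfrac{\partial f}{\partial x^\alpha}\dfrac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}$, shows that the set
of second derivatives $\dfrac{\partial^2 f}{\partial y^i\,\partial y^j}$ does not transform according to a tensorial
law. It follows from this example that the set of partial derivatives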


® E. B. Christoffel, Crelle Journal, 70 (1869). 




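of a covariant vector, in general, is not a tensor. Indeed, if we have
a covariant vector $A_i(x)$, then

$$ B_i(y) = \frac{\partial x^\alpha}{\partial y^i}\,A_\alpha(x) $$

and

$$ \frac{\partial B_i}{\partial y^j} = \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\frac{\partial A_\alpha}{\partial x^\beta} + \frac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}\,A_\alpha, \tag{33.1} $$

so that the derivatives of a vector do not form a tensor unless the coordinate
transformation $x^i = x^i(y)$ is affine. If we insert in (33.1) for $\dfrac{\partial^2 x^\alpha}{\partial y^i\,\partial y^j}$ from
the Christoffel formula 32.5, we get

$$ \frac{\partial B_i}{\partial y^j} = \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\frac{\partial A_\alpha}{\partial x^\beta} + {}_{y}\!\left\{{\gamma \atop ij}\right\}\frac{\partial x^\alpha}{\partial y^\gamma}\,A_\alpha - {}_{x}\!\left\{{\alpha \atop \gamma\beta}\right\}\frac{\partial x^\gamma}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,A_\alpha. $$

Since $\dfrac{\partial x^\alpha}{\partial y^\gamma}A_\alpha = B_\gamma$, we have on rearranging

$$ \frac{\partial B_i}{\partial y^j} - {}_{y}\!\left\{{\gamma \atop ij}\right\}B_\gamma = \left(\frac{\partial A_\alpha}{\partial x^\beta} - {}_{x}\!\left\{{\gamma \atop \alpha\beta}\right\}A_\gamma\right)\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}, \tag{33.2} $$

from which it is clear that the set of $n^2$ functions $\dfrac{\partial A_\alpha}{\partial x^\beta} - \left\{{\gamma \atop \alpha\beta}\right\}A_\gamma$ obeys
the law of transformation for a covariant tensor of rank two. This leads
us to formulate a

DEFINITION 1. The set of $n^2$ functions $\dfrac{\partial A_i}{\partial x^j} - \left\{{\alpha \atop ij}\right\}A_\alpha$ defines the covariant
$x^j$ derivative (with respect to $g_{ij}$) of the covariant tensor $A_i$.
We denote the covariant $x^j$ derivative of $A_i$ by the symbol $A_{i,j}$. Thus

$$ A_{i,j} = \frac{\partial A_i}{\partial x^j} - \left\{{\alpha \atop ij}\right\}A_\alpha. \tag{33.3} $$

It should be noted that in order to compute the covariant derivative
it is necessary to have the set of Christoffel symbols; that is, the fundamental
tensor $g_{ij}$ must be given in advance.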

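Similarly, if we start with a contravariant vector $A^\alpha$, and differentiate
the relation $B^i(y) = \dfrac{\partial y^i}{\partial x^\alpha}A^\alpha(x)$, we obtain

$$ \frac{\partial B^i}{\partial y^j} = \frac{\partial A^\alpha}{\partial x^\beta}\frac{\partial x^\beta}{\partial y^j}\frac{\partial y^i}{\partial x^\alpha} + A^\alpha\frac{\partial^2 y^i}{\partial x^\alpha\,\partial x^\beta}\frac{\partial x^\beta}{\partial y^j}, $$

and making use of the formula 32.6, we find

$$ \frac{\partial B^i}{\partial y^j} + {}_{y}\!\left\{{i \atop \alpha j}\right\}B^\alpha = \left(\frac{\partial A^\alpha}{\partial x^\beta} + {}_{x}\!\left\{{\alpha \atop \gamma\beta}\right\}A^\gamma\right)\frac{\partial y^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial y^j}. $$

Thus the set of $n^2$ quantities $A(i, j) = \dfrac{\partial A^i}{\partial x^j} + \left\{{i \atop \alpha j}\right\}A^\alpha$ forms a mixed tensor
of rank two. Accordingly, we introduce a

DEFINITION 2. The set of $n^2$ functions $\dfrac{\partial A^i}{\partial x^j} + \left\{{i \atop \alpha j}\right\}A^\alpha$ represents the
covariant $x^j$ derivative (with respect to $g_{ij}$) of the contravariant tensor $A^i$.
We denote the covariant $x^j$ derivative of the contravariant tensor $A^i$ by
the symbol $A^i_{,j}$. Thus

$$ A^i_{,j} = \frac{\partial A^i}{\partial x^j} + \left\{{i \atop \alpha j}\right\}A^\alpha. \tag{33.4} $$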
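
Formula 33.4 can be exercised on a case where the answer is known in advance. The sketch below rests on assumptions chosen for illustration: plane polar coordinates $y^1 = r$, $y^2 = \theta$ with the symbols $\left\{{1 \atop 22}\right\} = -r$, $\left\{{2 \atop 12}\right\} = \left\{{2 \atop 21}\right\} = 1/r$, and the field whose cartesian components are the constants (1, 0), so that its polar components are $A^1 = \cos\theta$, $A^2 = -\sin\theta / r$. Since the covariant derivative is a tensor that vanishes in the cartesian system, it should vanish in the polar system as well. Indices in the code are zero-based and the function names are hypothetical.

```python
import math

def christoffel_polar(y):
    """Second-kind symbols of plane polar coordinates, S[k][i][j] = {k; ij}."""
    r = y[0]
    S = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    S[0][1][1] = -r                       # {1; 22} = -r
    S[1][0][1] = S[1][1][0] = 1.0 / r     # {2; 12} = {2; 21} = 1/r
    return S

def field(y):
    """Polar components of the constant cartesian field (1, 0)."""
    r, theta = y
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(A, christoffel, y, h=1e-6):
    """D[i][j] = dA^i/dy^j + {i; a j} A^a, formula 33.4."""
    n = len(y)
    S = christoffel(y)
    a_here = A(y)
    D = [[0.0] * n for _ in range(n)]
    for j in range(n):
        yp = list(y)
        ym = list(y)
        yp[j] += h
        ym[j] -= h
        ap, am = A(yp), A(ym)
        for i in range(n):
            partial = (ap[i] - am[i]) / (2 * h)   # central difference
            D[i][j] = partial + sum(S[i][a][j] * a_here[a] for a in range(n))
    return D

D = covariant_derivative(field, christoffel_polar, [1.5, 0.8])
```

All four entries of `D` are zero to within the accuracy of the finite differences, although the ordinary partial derivatives $\partial A^i/\partial y^j$ of the polar components are not.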


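The definitions 33.3 and 33.4 can be extended, in an obvious way, to
mixed tensors. Thus we define the covariant $x^l$ derivative (with respect to a
given tensor $g_{ij}$) of the mixed tensor $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}$ by the formula

$$ A^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l} = \frac{\partial A^{j_1 \cdots j_s}_{i_1 \cdots i_r}}{\partial x^l} - \left\{{\alpha \atop i_1 l}\right\}A^{j_1 \cdots j_s}_{\alpha i_2 \cdots i_r} - \cdots - \left\{{\alpha \atop i_r l}\right\}A^{j_1 \cdots j_s}_{i_1 \cdots i_{r-1}\alpha} + \left\{{j_1 \atop \alpha l}\right\}A^{\alpha j_2 \cdots j_s}_{i_1 \cdots i_r} + \cdots + \left\{{j_s \atop \alpha l}\right\}A^{j_1 \cdots j_{s-1}\alpha}_{i_1 \cdots i_r}. \tag{33.5} $$

A verification of the fact that the set of functions $A^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l}(x)$ forms a
tensor of the type indicated by the indices presents no difficulty.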

If A is a tensor of rank zero, we define its covariant derivative to be the
ordinary derivative. Thus $A_{,i} = \partial A/\partial x^i$. This definition is consistent
with the formula 33.5. We also note that, if the $g_{ij}$'s are constants, the
Christoffel symbols vanish identically, and hence the covariant derivatives
reduce to the ordinary derivatives. This will surely be true if the $g_{ij}$'s are
the metric coefficients of a Euclidean space covered by a cartesian reference
system.

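We remark in conclusion that the covariant $x^l$ derivatives of relative
tensors are defined as follows. If $f(x)$ is a relative scalar of weight W,
so that $g(y) = f(x)\left|\dfrac{\partial x^i}{\partial y^j}\right|^{W}$, then

$$ f_{,l} = \frac{\partial f}{\partial x^l} - W\left\{{\alpha \atop \alpha l}\right\}f. \tag{33.6} $$

This set of functions represents a relative vector of weight W. If $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}$
is a relative tensor of weight W, then its covariant $x^l$ derivative is a relative
tensor of weight W, determined by the formula

$$ A^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l} = \frac{\partial A^{j_1 \cdots j_s}_{i_1 \cdots i_r}}{\partial x^l} - \left\{{\alpha \atop i_1 l}\right\}A^{j_1 \cdots j_s}_{\alpha i_2 \cdots i_r} - \cdots - \left\{{\alpha \atop i_r l}\right\}A^{j_1 \cdots j_s}_{i_1 \cdots i_{r-1}\alpha} + \left\{{j_1 \atop \alpha l}\right\}A^{\alpha j_2 \cdots j_s}_{i_1 \cdots i_r} + \cdots + \left\{{j_s \atop \alpha l}\right\}A^{j_1 \cdots j_{s-1}\alpha}_{i_1 \cdots i_r} - W\left\{{\alpha \atop \alpha l}\right\}A^{j_1 \cdots j_s}_{i_1 \cdots i_r}. $$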


Problems 


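1. Prove that the following expressions are tensors.

(a) $A^{ij}_{,k} = \dfrac{\partial A^{ij}}{\partial x^k} + \left\{{i \atop \alpha k}\right\}A^{\alpha j} + \left\{{j \atop \alpha k}\right\}A^{i\alpha}$,

(b) $A^{i}_{j,k} = \dfrac{\partial A^{i}_{j}}{\partial x^k} + \left\{{i \atop \alpha k}\right\}A^{\alpha}_{j} - \left\{{\alpha \atop jk}\right\}A^{i}_{\alpha}$,

(c) $A_{ij,k} = \dfrac{\partial A_{ij}}{\partial x^k} - \left\{{\alpha \atop ik}\right\}A_{\alpha j} - \left\{{\alpha \atop jk}\right\}A_{i\alpha}$,

(d) $A_{ijk,l} = \dfrac{\partial A_{ijk}}{\partial x^l} - \left\{{\alpha \atop il}\right\}A_{\alpha jk} - \left\{{\alpha \atop jl}\right\}A_{i\alpha k} - \left\{{\alpha \atop kl}\right\}A_{ij\alpha}$.

2. Prove that ${}_{a}\!\left\{{k \atop ij}\right\} - {}_{b}\!\left\{{k \atop ij}\right\}$ are components of a tensor of rank three, where
${}_{a}\!\left\{{k \atop ij}\right\}$ and ${}_{b}\!\left\{{k \atop ij}\right\}$ are the Christoffel symbols formed from the symmetric tensors
$a_{ij}(x)$ and $b_{ij}(x)$.

3. Use the formula $\dfrac{\partial}{\partial y^j}\left|\dfrac{\partial x}{\partial y}\right| = \left|\dfrac{\partial x}{\partial y}\right|\dfrac{\partial^2 x^\alpha}{\partial y^\beta\,\partial y^j}\dfrac{\partial y^\beta}{\partial x^\alpha}$ and the law of
transformation of relative scalars of weight W to deduce formula 33.6.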


34. Formulas for Covariant Differentiation 


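It is easy to deduce from the structure of formula 33.5 that the rules for
covariant differentiation of sums and products of tensors are identical
with those used in the ordinary differentiation. Indeed, if $A^{j_1 \cdots j_s}_{i_1 \cdots i_r}(x)$ and
$B^{j_1 \cdots j_s}_{i_1 \cdots i_r}(x)$ are two tensors, then the formula

$$ \left(A^{j_1 \cdots j_s}_{i_1 \cdots i_r} + B^{j_1 \cdots j_s}_{i_1 \cdots i_r}\right)_{,l} = A^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l} + B^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l} $$

follows directly from the structure of formula 33.5. To prove that the derivatives
of the outer and inner products are given by the familiar rules,

$$ \left(A^{j_1 \cdots j_s}_{i_1 \cdots i_r}B^{j_{s+1} \cdots j_p}_{i_{r+1} \cdots i_q}\right)_{,l} = A^{j_1 \cdots j_s}_{i_1 \cdots i_r,\,l}\,B^{j_{s+1} \cdots j_p}_{i_{r+1} \cdots i_q} + A^{j_1 \cdots j_s}_{i_1 \cdots i_r}\,B^{j_{s+1} \cdots j_p}_{i_{r+1} \cdots i_q,\,l}, $$

$$ \left(A^{j_1 \cdots j_{s-1}\alpha}_{i_1 \cdots i_r}B^{j_{s+1} \cdots j_p}_{\alpha i_{r+1} \cdots i_q}\right)_{,l} = A^{j_1 \cdots j_{s-1}\alpha}_{i_1 \cdots i_r,\,l}\,B^{j_{s+1} \cdots j_p}_{\alpha i_{r+1} \cdots i_q} + A^{j_1 \cdots j_{s-1}\alpha}_{i_1 \cdots i_r}\,B^{j_{s+1} \cdots j_p}_{\alpha i_{r+1} \cdots i_q,\,l}, $$

we need only insert for A in formula 33.5 the product AB. We illustrate
the procedure by considering the product $A_{ij}B^{km} = C^{km}_{ij}$. We have

$$ C^{km}_{ij,\,l} = \frac{\partial C^{km}_{ij}}{\partial x^l} - \left\{{\alpha \atop il}\right\}C^{km}_{\alpha j} - \left\{{\alpha \atop jl}\right\}C^{km}_{i\alpha} + \left\{{k \atop \alpha l}\right\}C^{\alpha m}_{ij} + \left\{{m \atop \alpha l}\right\}C^{k\alpha}_{ij} $$

$$ = B^{km}\left(\frac{\partial A_{ij}}{\partial x^l} - \left\{{\alpha \atop il}\right\}A_{\alpha j} - \left\{{\alpha \atop jl}\right\}A_{i\alpha}\right) + A_{ij}\left(\frac{\partial B^{km}}{\partial x^l} + \left\{{k \atop \alpha l}\right\}B^{\alpha m} + \left\{{m \atop \alpha l}\right\}B^{k\alpha}\right) $$

$$ = A_{ij,\,l}\,B^{km} + A_{ij}\,B^{km}_{,l}. $$

This establishes the desired result. As an exercise the reader may show
that

$$ \left(A_{i\alpha}B^{\alpha k}\right)_{,l} = A_{i\alpha,\,l}\,B^{\alpha k} + A_{i\alpha}\,B^{\alpha k}_{,l}. $$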
He can also show that the operations of covariant differentiation and 


contraction can be permuted. i 
We conclude this section by remarking that in covariant differentiation 


the Kronecker deltas behave like constants. Indeed, from (33.5) we have 


Be Ge la 
it 3 \jl at all’ 


pte a 
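
The remark that contraction and covariant differentiation can be permuted may be illustrated on the simplest inner product. The following is only a sketch under the rules stated above; the vectors $A_i$ and $B^i$ are arbitrary and are introduced here purely for illustration.

```latex
% Covariant derivative of the scalar A_i B^i by the product rule.
\begin{align*}
(A_i B^i)_{,l}
  &= A_{i,l}B^i + A_i B^i_{,l} \\
  &= \left(\frac{\partial A_i}{\partial x^l}
        - \left\{{\alpha \atop il}\right\}A_\alpha\right)B^i
   + A_i\left(\frac{\partial B^i}{\partial x^l}
        + \left\{{i \atop \alpha l}\right\}B^\alpha\right) \\
  &= \frac{\partial (A_i B^i)}{\partial x^l},
\end{align*}
% because, after renaming dummy indices, the two Christoffel terms cancel:
%   -{alpha, il} A_alpha B^i + {i, alpha l} A_i B^alpha = 0.
```

The result is the ordinary partial derivative, as it must be for a scalar, so contracting first and then differentiating gives the same answer as differentiating first and then contracting.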


Problems 


1. Note that the operation of contraction of indices in $A^i_j$ is equivalent to
multiplying $A^i_j$ by $\delta^j_i$. Using this, show that the operation of contraction can
be performed on a tensor either before or after covariant differentiation.

2. Show that the operation of raising or lowering of indices can be performed 
either before or after covariant differentiation. 




35. Ricci’s Theorem 


We will show in this section that the fundamental tensors $g_{ij}$ and $g^{ij}$
behave in covariant differentiation as though they were constants. This
follows from

RICCI’S THEOREM. The covariant derivative of either of the fundamental
tensors is zero.

Proof. Consider first the tensor $g_{ij}$ and form

$$g_{ij,l} = \frac{\partial g_{ij}}{\partial x^l} - \left\{{\alpha \atop il}\right\}g_{\alpha j} - \left\{{\alpha \atop jl}\right\}g_{i\alpha}.$$

The right-hand member of this expression vanishes identically by virtue
of (31.7), so that $g_{ij,l} = 0$.

We can perform a similar calculation for the tensor $g^{ij}$, but it may prove
more instructive to differentiate the inner product $g^{i\alpha}g_{\alpha j} = \delta^i_j$. Thus

$$g^{i\alpha}_{,l}\,g_{\alpha j} + g^{i\alpha}g_{\alpha j,l} = \delta^i_{j,l};$$

since $\delta^i_{j,l} = 0$ and $g_{\alpha j,l} = 0$, we have

$$g^{i\alpha}_{,l}\,g_{\alpha j} = 0.$$

However, since $|g_{ij}| \neq 0$, the only solution of this system of homogeneous
equations is $g^{i\alpha}_{,l} = 0$.

As an immediate corollary of Ricci’s theorem we note that the fundamental
tensors may be taken outside the sign of covariant differentiation,
and hence the operations of lowering and raising indices are permutable
with covariant differentiation. Thus

$$(g_{ij}A^{jk})_{,l} = g_{ij}A^{jk}_{,l}.$$
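
As a concrete check of Ricci's theorem, one can take plane polar coordinates. This is a sketch under assumptions not made in the text: the coordinates $x^1 = r$, $x^2 = \theta$ and the metric $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$ are chosen here only for illustration, with the familiar nonvanishing Christoffel symbols quoted in the comments.

```latex
% Nonvanishing Christoffel symbols of the second kind for ds^2 = dr^2 + r^2 d\theta^2:
%   {1, 22} = -r,   {2, 12} = {2, 21} = 1/r.
\begin{align*}
g_{22,1} &= \frac{\partial g_{22}}{\partial x^1}
            - \left\{{\alpha \atop 21}\right\}g_{\alpha 2}
            - \left\{{\alpha \atop 21}\right\}g_{2\alpha}
          = 2r - \frac{1}{r}\,r^2 - \frac{1}{r}\,r^2 = 0, \\
g_{12,2} &= \frac{\partial g_{12}}{\partial x^2}
            - \left\{{\alpha \atop 12}\right\}g_{\alpha 2}
            - \left\{{\alpha \atop 22}\right\}g_{1\alpha}
          = 0 - \frac{1}{r}\,r^2 - (-r)(1) = 0.
\end{align*}
```

Although $\partial g_{22}/\partial x^1 = 2r$ does not vanish, the Christoffel terms cancel it exactly, which is the content of the theorem in this special case.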


36. Riemann-Christoffel Tensor 


We recall that a sufficient condition for the equality of mixed partial
derivatives $\dfrac{\partial^2 u}{\partial x\,\partial y}$ and $\dfrac{\partial^2 u}{\partial y\,\partial x}$ of a function $u(x, y)$ is that $u(x, y)$ be of class
$C^2$. We will assume henceforth that the tensor components under consideration
belong to class $C^2$, but this restriction alone, as we shall see
presently, is not sufficient to insure the equality of mixed covariant
derivatives. Indeed, it will be shown that, if the order of covariant
differentiation is to be immaterial, our tensors must be defined over a particular
metric manifold X for which a certain tensor of rank four, made up
entirely of the $g_{ij}$’s, vanishes. This tensor, known as the Riemann-Christoffel
tensor, plays a basic role in many investigations of differential geometry,
dynamics of rigid and deformable bodies, electrodynamics, and relativity.




The covariant derivative of a tensor is a tensor; hence it can be differ- 
entiated covariantly again to obtain a new tensor. This tensor is called the 
second covariant derivative of the given tensor. 

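Consider the covariant $x^j$ derivative of $A_i$ with respect to $g_{ij}$:

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - \left\{{\alpha \atop ij}\right\}A_\alpha. \tag{36.1}$$

Now, if (36.1) is differentiated covariantly with respect to $x^k$, there results
the tensor

$$A_{i,jk} = \frac{\partial}{\partial x^k}\left(\frac{\partial A_i}{\partial x^j} - \left\{{\alpha \atop ij}\right\}A_\alpha\right) - \left\{{\alpha \atop ik}\right\}\left(\frac{\partial A_\alpha}{\partial x^j} - \left\{{\beta \atop \alpha j}\right\}A_\beta\right) - \left\{{\alpha \atop jk}\right\}\left(\frac{\partial A_i}{\partial x^\alpha} - \left\{{\beta \atop i\alpha}\right\}A_\beta\right). \tag{36.2}$$

On the other hand,

$$A_{i,kj} = \frac{\partial}{\partial x^j}\left(\frac{\partial A_i}{\partial x^k} - \left\{{\alpha \atop ik}\right\}A_\alpha\right) - \left\{{\alpha \atop ij}\right\}\left(\frac{\partial A_\alpha}{\partial x^k} - \left\{{\beta \atop \alpha k}\right\}A_\beta\right) - \left\{{\alpha \atop kj}\right\}\left(\frac{\partial A_i}{\partial x^\alpha} - \left\{{\beta \atop i\alpha}\right\}A_\beta\right). \tag{36.3}$$

Carrying out the indicated differentiation in (36.2) and (36.3) yields

$$A_{i,jk} = \frac{\partial^2 A_i}{\partial x^j\,\partial x^k} - \frac{\partial}{\partial x^k}\left\{{\alpha \atop ij}\right\}A_\alpha - \left\{{\alpha \atop ij}\right\}\frac{\partial A_\alpha}{\partial x^k} - \left\{{\alpha \atop ik}\right\}\frac{\partial A_\alpha}{\partial x^j} + \left\{{\alpha \atop ik}\right\}\left\{{\beta \atop \alpha j}\right\}A_\beta - \left\{{\alpha \atop jk}\right\}\frac{\partial A_i}{\partial x^\alpha} + \left\{{\alpha \atop jk}\right\}\left\{{\beta \atop i\alpha}\right\}A_\beta, \tag{36.4}$$

$$A_{i,kj} = \frac{\partial^2 A_i}{\partial x^k\,\partial x^j} - \frac{\partial}{\partial x^j}\left\{{\alpha \atop ik}\right\}A_\alpha - \left\{{\alpha \atop ik}\right\}\frac{\partial A_\alpha}{\partial x^j} - \left\{{\alpha \atop ij}\right\}\frac{\partial A_\alpha}{\partial x^k} + \left\{{\alpha \atop ij}\right\}\left\{{\beta \atop \alpha k}\right\}A_\beta - \left\{{\alpha \atop kj}\right\}\frac{\partial A_i}{\partial x^\alpha} + \left\{{\alpha \atop kj}\right\}\left\{{\beta \atop i\alpha}\right\}A_\beta. \tag{36.5}$$

If we subtract (36.5) from (36.4), we get

$$A_{i,jk} - A_{i,kj} = \left\{{\alpha \atop ik}\right\}\left\{{\beta \atop \alpha j}\right\}A_\beta - \frac{\partial}{\partial x^k}\left\{{\alpha \atop ij}\right\}A_\alpha$$
$$\qquad\qquad - \left\{{\alpha \atop ij}\right\}\left\{{\beta \atop \alpha k}\right\}A_\beta + \frac{\partial}{\partial x^j}\left\{{\alpha \atop ik}\right\}A_\alpha,$$

and an interchange of $\alpha$ and $\beta$ in the first terms of each preceding line gives

$$A_{i,jk} - A_{i,kj} = \left[\frac{\partial}{\partial x^j}\left\{{\alpha \atop ik}\right\} - \frac{\partial}{\partial x^k}\left\{{\alpha \atop ij}\right\} + \left\{{\beta \atop ik}\right\}\left\{{\alpha \atop \beta j}\right\} - \left\{{\beta \atop ij}\right\}\left\{{\alpha \atop \beta k}\right\}\right]A_\alpha. \tag{36.6}$$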

Since $A_\alpha$ is an arbitrary covariant tensor of rank one, and since the
difference of two tensors $A_{i,jk} - A_{i,kj}$ is a covariant tensor of rank three,
we know by the Quotient Theorem I of Sec. 26 that the expression in the
bracket of (36.6) is a mixed tensor of rank four; that is,

$$\frac{\partial}{\partial x^j}\left\{{\alpha \atop ik}\right\} - \frac{\partial}{\partial x^k}\left\{{\alpha \atop ij}\right\} + \left\{{\beta \atop ik}\right\}\left\{{\alpha \atop \beta j}\right\} - \left\{{\beta \atop ij}\right\}\left\{{\alpha \atop \beta k}\right\} \equiv R^\alpha_{ijk}.$$

Furthermore, if the left-hand member of (36.6) is to vanish, that is, if the
order of covariant differentiation is to be immaterial, then

$$R^\alpha_{ijk} = 0$$

since $A_\alpha$ is arbitrary. In general, however, $R^\alpha_{ijk} \neq 0$, so that the order of
covariant differentiation is not immaterial. It is clear from (36.6) that a
necessary and sufficient condition for the validity of inversion of the order of
covariant differentiation is that the tensor $R^\alpha_{ijk}$ vanishes identically.

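The tensor

$$R^i_{jkl} = \frac{\partial}{\partial x^k}\left\{{i \atop jl}\right\} - \frac{\partial}{\partial x^l}\left\{{i \atop jk}\right\} + \left\{{\alpha \atop jl}\right\}\left\{{i \atop \alpha k}\right\} - \left\{{\alpha \atop jk}\right\}\left\{{i \atop \alpha l}\right\} \tag{36.7}$$

is called the mixed Riemann-Christoffel tensor or the Riemann-Christoffel
tensor of the second kind.

The associated tensor

$$R_{ijkl} = g_{i\alpha}R^\alpha_{jkl} \tag{36.8}$$

is known as the covariant Riemann-Christoffel tensor, or the Riemann-Christoffel
tensor of the first kind.

It is not difficult to verify that the defining formula 36.8 for $R_{ijkl}$ can be
written in the convenient determinantal form

$$R_{ijkl} = \begin{vmatrix} \dfrac{\partial}{\partial x^k} & \dfrac{\partial}{\partial x^l} \\ [jk, i] & [jl, i] \end{vmatrix} + \begin{vmatrix} \left\{{\alpha \atop jk}\right\} & \left\{{\alpha \atop jl}\right\} \\ [ik, \alpha] & [il, \alpha] \end{vmatrix}, \tag{36.9}$$

which will be found useful in listing properties of this tensor in Sec. 37.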




We remark in conclusion that formula 36.6 is a special case of an
identity, established by Ricci, which we record here without proof,
although the nature of proof is quite clear from the proof of the case
treated previously. This identity reads

$$A_{i_1\cdots i_m,jk} - A_{i_1\cdots i_m,kj} = \sum_{a=1}^{m} A_{i_1\cdots i_{a-1}\alpha i_{a+1}\cdots i_m}R^\alpha_{i_a jk}.$$

In the special case of a tensor of rank two it assumes the form

$$A_{ij,kl} - A_{ij,lk} = A_{i\alpha}R^\alpha_{jkl} + A_{\alpha j}R^\alpha_{ikl}.$$
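
To see that the Riemann-Christoffel tensor need not vanish, it may help to evaluate one component on a sphere. This is a sketch under assumptions introduced here for illustration only: a sphere of radius $a$ with coordinates $x^1 = \theta$, $x^2 = \varphi$ and metric $g_{11} = a^2$, $g_{22} = a^2\sin^2\theta$, $g_{12} = 0$. The component is computed from the expansion of the determinants in (36.9).

```latex
% Christoffel symbols needed (all others entering the sum vanish):
%   [22,1] = -a^2 \sin\theta\cos\theta,   [12,2] = a^2 \sin\theta\cos\theta,
%   {2, 12} = \cot\theta,                 [21,1] = [11,1] = [11,2] = 0.
\begin{align*}
R_{1212} &= \frac{\partial}{\partial x^1}[22,1] - \frac{\partial}{\partial x^2}[21,1]
           + \left\{{\alpha \atop 21}\right\}[12,\alpha]
           - \left\{{\alpha \atop 22}\right\}[11,\alpha] \\
         &= -a^2(\cos^2\theta - \sin^2\theta) - 0
           + \cot\theta\,\bigl(a^2\sin\theta\cos\theta\bigr) - 0 \\
         &= a^2\sin^2\theta .
\end{align*}
```

Since $R_{1212} \neq 0$, the order of covariant differentiation on the sphere cannot be inverted in general, in agreement with the condition derived from (36.6).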


Problems 
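1. Show that

$$R_{ijkl} = \frac{\partial}{\partial x^k}[jl, i] - \frac{\partial}{\partial x^l}[jk, i] + \left\{{\alpha \atop jk}\right\}[il, \alpha] - \left\{{\alpha \atop jl}\right\}[ik, \alpha].$$

2. Show that

$$R_{ijkl} = \frac{1}{2}\left(\frac{\partial^2 g_{il}}{\partial x^j\,\partial x^k} + \frac{\partial^2 g_{jk}}{\partial x^i\,\partial x^l} - \frac{\partial^2 g_{ik}}{\partial x^j\,\partial x^l} - \frac{\partial^2 g_{jl}}{\partial x^i\,\partial x^k}\right) + g^{\alpha\beta}\bigl([jk, \beta][il, \alpha] - [jl, \beta][ik, \alpha]\bigr).$$

3. Using the formula of Problem 2 show that

$$R_{ijkl} = -R_{jikl} = -R_{ijlk} = R_{klij}$$

and $R_{ijkl} + R_{iklj} + R_{iljk} = 0$.

4. If $\phi$ is a scalar, then $g^{ij}\phi_{,ij}$ is a scalar and is equal to

$$\frac{1}{\sqrt{g}}\frac{\partial}{\partial x^i}\left(\sqrt{g}\,g^{ij}\frac{\partial \phi}{\partial x^j}\right).$$

5. Referring to Problem 4, show that $g^{ij}\phi_{,ij} = 0$ reduces to $\partial^2\phi/\partial x^i\,\partial x^i = 0$
when the $g_{ij}$ are the metric coefficients of $E_3$ referred to a cartesian frame. This
implies that Laplace’s equation in general curvilinear coordinates has the form
$g^{ij}\phi_{,ij} = 0$, since this is a tensor equation.

6. Referring to Problem 5, show that Laplace’s equation in polar coordinates
has the form

$$\frac{\partial^2\phi}{\partial (x^1)^2} + \frac{1}{(x^1)^2}\frac{\partial^2\phi}{\partial (x^2)^2} + \frac{1}{(x^1\sin x^2)^2}\frac{\partial^2\phi}{\partial (x^3)^2} + \frac{2}{x^1}\frac{\partial\phi}{\partial x^1} + \frac{\cot x^2}{(x^1)^2}\frac{\partial\phi}{\partial x^2} = 0.$$
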
37. Properties of Riemann-Christoffel Tensors 


From defining formula 36.7 for a mixed tensor $R^i_{jkl}$, we see immediately
that the set of functions $R^i_{jkl}$ is skew-symmetric with respect to the last two
covariant indices. Thus

$$R^i_{jkl} = -R^i_{jlk} \tag{37.1}$$

and hence $R^i_{jkk} = 0$ (no summation on $k$).


We have defined the covariant tensor $R_{ijkl}$ by the formula

$$R_{ijkl} = g_{i\alpha}R^\alpha_{jkl}$$

and, if we multiply this equation through by $g^{i\beta}$ and sum, we get

$$R^\beta_{jkl} = g^{i\beta}R_{ijkl} \tag{37.2}$$

so that the Riemann-Christoffel tensor of the second kind is obtained by
raising the first covariant index in the tensor $R_{ijkl}$. To determine the
properties of the set of functions defining the Riemann-Christoffel tensor
of the first kind we expand the determinants in (36.9) and insert for
Christoffel’s symbols in the first determinant the definitions 31.1. We get
after a simple calculation the formula

$$R_{ijkl} = \frac{1}{2}\left(\frac{\partial^2 g_{il}}{\partial x^j\,\partial x^k} + \frac{\partial^2 g_{jk}}{\partial x^i\,\partial x^l} - \frac{\partial^2 g_{ik}}{\partial x^j\,\partial x^l} - \frac{\partial^2 g_{jl}}{\partial x^i\,\partial x^k}\right) + g^{\alpha\beta}\bigl([jk, \beta][il, \alpha] - [jl, \beta][ik, \alpha]\bigr), \tag{37.3}$$

from which it is obvious that

(a) $R_{ijkl} = -R_{jikl}$,

(b) $R_{ijkl} = -R_{ijlk}$,

(c) $R_{ijkl} = R_{klij}$,

(d) $R_{ijkl} + R_{iklj} + R_{iljk} = 0$.


The last identity can be verified by direct substitution; by raising indices
we obtain an identity analogous to (d) for the mixed tensor $R^i_{jkl}$:

(e) $R^i_{jkl} + R^i_{klj} + R^i_{ljk} = 0$.


(f) The components of a Riemann-Christoffel tensor with more than two 
like indices are necessarily zero. The identities (a) and (b) state that the
tensor $R_{ijkl}$ is skew-symmetric with respect to the first two and last two
indices, and the identity (c) signifies that $R_{ijkl}$ is symmetric with respect to
groups of first two and last two indices. It follows from these identities
that distinct, nonvanishing components of $R_{ijkl}$ are of three types:

1. Symbols with two distinct indices, that is, symbols of the type $R_{ijij}$.
2. Symbols with only three distinct indices, which are of the type $R_{ijik}$.
3. Symbols $R_{ijkl}$, with four distinct indices.

It is now an easy matter to verify⁹ that the total number $N$ of distinct
nonvanishing components of $R_{ijkl}$ is $N = n^2(n^2 - 1)/12$.

⁹ There are $n_1 = n(n - 1)/2$ distinct nonvanishing symbols of the type $R_{ijij}$,
$n_2 = n(n - 1)(n - 2)/2$ of the type $R_{ijik}$, and $n_3 = n(n - 1)(n - 2)(n - 3)/12$ of the
type $R_{ijkl}$.

In a three-dimensional space, distinct, nonvanishing components $R_{ijkl}$
have the suffixes: 1212, 1313, 2323, 1213, 2123, 3132, and in two dimensions
from the total of $2^4 = 16$ components there is only one distinct
nonvanishing component: $R_{1212}$. We will see that this tensor characterizes an
extremely important property of surfaces.
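
The count $N = n^2(n^2 - 1)/12$ can also be checked by brute force. The sketch below is not the counting argument of the footnote: it treats the $n^4$ components $R_{ijkl}$ as unknowns, imposes the identities (a) to (d) as linear equations, and reports the dimension of the solution space. The function names (`symmetry_constraints`, `rank`, `independent_components`) are hypothetical helpers introduced here, and indices run from 0 to n-1 for convenience.

```python
from fractions import Fraction
from itertools import product


def symmetry_constraints(n):
    """Identities (a)-(d) as sparse linear equations in the unknowns R_ijkl."""
    rows = []

    def add(*terms):
        row = {}
        for key, coeff in terms:
            row[key] = row.get(key, 0) + Fraction(coeff)
        row = {k: v for k, v in row.items() if v != 0}
        if row:
            rows.append(row)

    for i, j, k, l in product(range(n), repeat=4):
        add(((i, j, k, l), 1), ((j, i, k, l), 1))                     # (a)
        add(((i, j, k, l), 1), ((i, j, l, k), 1))                     # (b)
        add(((i, j, k, l), 1), ((k, l, i, j), -1))                    # (c)
        add(((i, j, k, l), 1), ((i, k, l, j), 1), ((i, l, j, k), 1))  # (d)
    return rows


def rank(rows):
    """Rank of a sparse system by incremental Gaussian elimination."""
    pivots = {}  # leading unknown -> reduced row
    for row in rows:
        row = dict(row)
        while row:
            lead = min(row)
            if lead not in pivots:
                pivots[lead] = row
                break
            pivot_row = pivots[lead]
            factor = row[lead] / pivot_row[lead]
            for key, value in pivot_row.items():
                new_value = row.get(key, 0) - factor * value
                if new_value == 0:
                    row.pop(key, None)
                else:
                    row[key] = new_value
    return len(pivots)


def independent_components(n):
    """Number of linearly independent R_ijkl allowed by (a)-(d)."""
    return n ** 4 - rank(symmetry_constraints(n))


counts = {n: independent_components(n) for n in (2, 3, 4)}
formula = {n: n * n * (n * n - 1) // 12 for n in (2, 3, 4)}
```

Comparing `counts` with `formula` for small n is a quick sanity check on the formula; the elimination is exact because it uses rational arithmetic.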


38. Ricci Tensor. Bianchi Identities. Einstein Tensor 


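We define the Ricci tensor $R_{ij}$ by the formula $R_{ij} = R^\alpha_{ij\alpha}$ which, by
virtue of (36.7), can be written as

$$R_{ij} = \frac{\partial}{\partial x^j}\left\{{\alpha \atop i\alpha}\right\} - \frac{\partial}{\partial x^\alpha}\left\{{\alpha \atop ij}\right\} + \left\{{\beta \atop i\alpha}\right\}\left\{{\alpha \atop \beta j}\right\} - \left\{{\beta \atop ij}\right\}\left\{{\alpha \atop \beta\alpha}\right\}.$$

In Sec. 31 we have shown that $\dfrac{\partial}{\partial x^i}\log\sqrt{g} = \left\{{\alpha \atop i\alpha}\right\}$, so that

$$R_{ij} = \frac{\partial^2 \log\sqrt{g}}{\partial x^i\,\partial x^j} - \frac{\partial}{\partial x^\alpha}\left\{{\alpha \atop ij}\right\} + \left\{{\beta \atop i\alpha}\right\}\left\{{\alpha \atop \beta j}\right\} - \left\{{\beta \atop ij}\right\}\frac{\partial \log\sqrt{g}}{\partial x^\beta}.$$

From inspection of this result we see that the tensor $R_{ij}$ is symmetric.
Since $R_{ij} = R_{ji}$, the number of distinct components of $R_{ij}$ is $\frac{1}{2}n(n + 1)$.
In a four-dimensional manifold $n = 4$, so that, if we set $R_{ij} = 0$, we obtain
ten partial differential equations, which Einstein has adopted as his
equations of the gravitational field in free space in the general theory of
relativity.¹⁰ In the development of that theory another tensor, introduced by
Einstein, plays an important role. This tensor is most readily obtained
from the identity

$$R^i_{jkl,m} + R^i_{jlm,k} + R^i_{jmk,l} = 0, \tag{38.1}$$

due to Bianchi.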
Since the covariant derivative of the fundamental tensor $g_{ij}$ vanishes,
the Bianchi identity can be written in the form

$$R_{ijkl,m} + R_{ijlm,k} + R_{ijmk,l} = 0. \tag{38.2}$$

If we multiply equation 38.2 by $g^{il}g^{jk}$ and make use of the skew-symmetric
properties of the Riemann tensor $R_{ijkl}$, we get

$$g^{jk}R_{jk,m} - g^{jk}R_{jm,k} - g^{il}R_{im,l} = 0.$$


10 See Problem 2. 


This result can be written as

$$R_{,m} - 2R^k_{m,k} = 0,$$

where $R = g^{ij}R_{ij}$, or in alternative form

$$\left(R^k_m - \tfrac{1}{2}\delta^k_m R\right)_{,k} = 0, \tag{38.3}$$

where $R^k_m = g^{jk}R_{jm}$. The tensor $R^k_m - \tfrac{1}{2}\delta^k_m R$,
in parentheses in equation 38.3, is known as the Einstein tensor.
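
A small worked case may make the Einstein tensor more concrete. The following sketch takes $n = 2$ and assumes the result stated in Problem 3 of this section, namely that $R_{ij}$ is proportional to $g_{ij}$ with factor $-R_{1212}/g$; the symbol $\rho$ for that factor is introduced here only for brevity.

```latex
% n = 2, with R_{ij} = \rho\, g_{ij} and \rho = -R_{1212}/g (Problem 3).
\begin{align*}
R     &= g^{ij}R_{ij} = \rho\, g^{ij}g_{ij} = 2\rho, \\
R^k_m &= g^{jk}R_{jm} = \rho\, g^{jk}g_{jm} = \rho\,\delta^k_m, \\
R^k_m - \tfrac{1}{2}\delta^k_m R &= \rho\,\delta^k_m - \rho\,\delta^k_m = 0 .
\end{align*}
```

Under that assumption the Einstein tensor vanishes identically in two dimensions, so (38.3) carries no information there; it becomes a genuine restriction only for $n \geq 3$.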


Problems 

1. Show that $R^\alpha_{\alpha kl} = 0$.

2. If $R_{ij} = \rho g_{ij}$, then $\rho = R/n$, where $R = g^{ij}R_{ij}$. (The equation $R_{ij} = \rho g_{ij}$
is known as the Einstein gravitational equation at points where matter is present.
It corresponds to the Poisson equation $\nabla^2 V = \rho$ in the Newtonian theory of
gravitation.)

3. If $n = 2$, show that $R_{11}/g_{11} = R_{22}/g_{22} = R_{12}/g_{12} = -R_{1212}/g$.

4. If $n = 3$, the tensor $R_{ijkl}$ has six distinct components, and there are six
equations $R_{jk} = g^{il}R_{ijkl}$. Prove that the solutions of these equations for $R_{ijkl}$
are given by

$$R_{ijkl} = g_{il}R_{jk} + g_{jk}R_{il} - g_{ik}R_{jl} - g_{jl}R_{ik} + \frac{R}{2}\left(g_{ik}g_{jl} - g_{il}g_{jk}\right),$$

where $R = g^{ij}R_{ij}$.
5. Verify Bianchi’s identity 38.2. 


39. Riemannian and Euclidean Spaces. Existence Theorem 


Let the n-dimensional space $V_n$ be covered by a coordinate system X.
We will metrize $V_n$ by prescribing the element of arc $ds$, so that

$$ds^2 = g_{ij}\,dx^i\,dx^j \tag{39.1}$$

is a positive definite quadratic form in the differentials $dx^i$. The functions
$g_{ij}(x)$ are assumed to be of class $C^1$ in $V_n$. The space $V_n$ so metrized is
called a Riemannian n-dimensional space $R_n$.

We will now consider in some detail the following question: What
restriction must be imposed on the symmetric tensor $g_{ij}(x)$ so that there be a
coordinate system Y, defined by

$$T:\quad y^i = y^i(x^1, \ldots, x^n), \qquad (i = 1, \ldots, n),$$

with $y^i(x)$ of class $C^2$ in $R_n$, in which the tensor $g_{ij}(x)$ has constant components
$h_{ij}$ throughout $R_n$?




This is one of the basic problems of differential geometry, which occurs
also under a different guise in dynamics, elasticity, relativity, and other
branches of applied mathematics.

We note first that the components of $g_{ij}(x)$, when referred to the
Y-frame, are given by

$$h_{ij}(y) = \frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}\,g_{\alpha\beta}(x). \tag{39.2}$$

If the $h_{ij}$’s are constants, then the Christoffel symbols ${}_y\!\left\{{k \atop ij}\right\}$ vanish identically.
Conversely, if the ${}_y\!\left\{{k \atop ij}\right\}$ vanish identically, $h_{ij,k} = \dfrac{\partial h_{ij}}{\partial y^k}$, and, since $h_{ij,k} = 0$
by Ricci’s theorem, we have $\partial h_{ij}/\partial y^k = 0$ in $R_n$. Consequently, the $h_{ij}$ are
constants throughout $R_n$. This permits us to state a

THEOREM I. A necessary and sufficient condition that the metric
coefficients $g_{ij}(x)$ reduce to constants $h_{ij}$ in some reference frame Y is that the
Christoffel symbols ${}_y\!\left\{{k \atop ij}\right\}$ vanish identically.


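From this theorem we can deduce at once a system of differential
equations that must be satisfied by functions $y^i(x^1, \ldots, x^n)$, if there is to be
a coordinate system Y in which the $h_{ij}$’s are constants. The law of
transformation 32.6 demands that

$${}_y\!\left\{{m \atop \alpha\beta}\right\}\frac{\partial y^\alpha}{\partial x^i}\frac{\partial y^\beta}{\partial x^j} = -\frac{\partial^2 y^m}{\partial x^i\,\partial x^j} + \left\{{k \atop ij}\right\}\frac{\partial y^m}{\partial x^k},$$

and, since ${}_y\!\left\{{m \atop \alpha\beta}\right\} = 0$, we have the system of equations

$$\frac{\partial^2 y^m}{\partial x^i\,\partial x^j} = \left\{{k \atop ij}\right\}\frac{\partial y^m}{\partial x^k}, \tag{39.3}$$

in which the symbols $\left\{{k \atop ij}\right\}$ are formed from the $g_{ij}(x)$. The system 39.3, of
second-order partial differential equations, can be rewritten in an equivalent
form as a system of first-order partial differential equations

$$\frac{\partial y}{\partial x^i} = u_i, \quad (i = 1, 2, \ldots, n); \qquad \frac{\partial u_i}{\partial x^j} = \left\{{\alpha \atop ij}\right\}u_\alpha, \quad (i, j = 1, 2, \ldots, n), \tag{39.4}$$

where $y$ stands for any one of the functions $y^m$.
This system, in general, will be incompatible, and we now turn to the
determination of the necessary and sufficient conditions for the existence of
solution of the system 39.4.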
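
A familiar case shows what the system 39.3 asserts. The sketch below assumes, for illustration only, that X is the plane polar system $x^1 = r$, $x^2 = \theta$ with $g_{11} = 1$, $g_{22} = r^2$, and tests the candidate function $y = r\cos\theta$, which is one of the cartesian coordinates.

```latex
% Nonvanishing Christoffel symbols: {1, 22} = -r,  {2, 12} = {2, 21} = 1/r.
% Candidate solution: y = r\cos\theta.
\begin{align*}
\frac{\partial^2 y}{\partial r^2} &= 0
   = \left\{{k \atop 11}\right\}\frac{\partial y}{\partial x^k}, \\
\frac{\partial^2 y}{\partial r\,\partial\theta} &= -\sin\theta
   = \frac{1}{r}\,(-r\sin\theta)
   = \left\{{2 \atop 12}\right\}\frac{\partial y}{\partial \theta}, \\
\frac{\partial^2 y}{\partial \theta^2} &= -r\cos\theta
   = (-r)\cos\theta
   = \left\{{1 \atop 22}\right\}\frac{\partial y}{\partial r}.
\end{align*}
```

All three equations of (39.3) are satisfied, as expected, since in the cartesian frame the metric coefficients of the plane are constants.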




In order to phrase these conditions in a symmetric form, we will consider
the system

$$\frac{\partial f^\alpha}{\partial x^i} = F^\alpha_i(f^1, \ldots, f^m, x^1, \ldots, x^n), \qquad (\alpha = 1, \ldots, m), \quad (i = 1, 2, \ldots, n), \tag{39.5}$$

where the $F^\alpha_i$ are known functions of the $f$’s and $x$’s. Equations 39.5
specialize to (39.4) if we set $f^1 = y$, $f^2 = u_1, \ldots, f^{n+1} = u_n$. The functions
$F^\alpha_i$ are defined over the n-dimensional region $R$ and for arbitrary values of
the functions $f^\alpha$, that is, for $-\infty < f^\alpha < \infty$. Let us refer to the region of
definition of functions $F^\alpha_i$ as $R'$. This region consists of the region $R$ of the
variables $x^i$ and the set of ranges

$$-\infty < f^\alpha < \infty.$$

We will suppose that the functions $F^\alpha_i$ are of class $C^1$ in $R'$. Since the region
$R'$ is open, we will assume that the $\partial F^\alpha_i/\partial f^\beta$ are bounded in $R'$. The
restrictions imposed on the choice of functions $F^\alpha_i$ are clearly satisfied by
functions appearing in the right-hand members of equations 39.4.

Since the $F^\alpha_i$ are of class $C^1$ in $R'$, it follows that the $f$’s are of class
$C^2$, and hence

$$\frac{\partial^2 f^\alpha}{\partial x^i\,\partial x^j} = \frac{\partial^2 f^\alpha}{\partial x^j\,\partial x^i}. \tag{39.6}$$

This is a necessary condition for the integrability of the system 39.5.
Differentiating equations 39.5 with respect to $x^j$, we obtain

$$\frac{\partial^2 f^\alpha}{\partial x^i\,\partial x^j} = \frac{\partial F^\alpha_i}{\partial x^j} + \frac{\partial F^\alpha_i}{\partial f^\beta}\frac{\partial f^\beta}{\partial x^j} = \frac{\partial F^\alpha_i}{\partial x^j} + \frac{\partial F^\alpha_i}{\partial f^\beta}F^\beta_j,$$

where the last step results from the substitution of the expression for
$\partial f^\beta/\partial x^j$ from (39.5). Now, if we form (39.6), we get as a necessary
condition for integrability the set of equations

$$\frac{\partial F^\alpha_i}{\partial x^j} + \frac{\partial F^\alpha_i}{\partial f^\beta}F^\beta_j = \frac{\partial F^\alpha_j}{\partial x^i} + \frac{\partial F^\alpha_j}{\partial f^\beta}F^\beta_i, \qquad (\alpha = 1, \ldots, m), \quad (i, j = 1, \ldots, n). \tag{39.7}$$

We see that if the system 39.5 has a solution, then either (39.7) are identities
in $f^\alpha$ and $x^i$ or else there are certain functional relations existing between
the $f$’s and $x$’s. If (39.7) are identities, the system of equations 39.5 is said
to be completely integrable. It is then possible to prove that the integrability
conditions (39.7) are not only necessary but also sufficient to guarantee the
existence of solutions of the system 39.5.
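
The meaning of the conditions 39.7 is easiest to see on a system with one unknown function and two independent variables. The example below is an illustration constructed here, not one from the text; the coefficient functions $a(x, y)$ and $b(x, y)$ are hypothetical and assumed to be of class $C^1$.

```latex
% One unknown f(x, y), with
%   \partial f/\partial x = F_1 = a(x,y)\, f,
%   \partial f/\partial y = F_2 = b(x,y)\, f.
\begin{align*}
\frac{\partial F_1}{\partial y} + \frac{\partial F_1}{\partial f}F_2
   &= \frac{\partial a}{\partial y}\,f + a\,b\,f, \\
\frac{\partial F_2}{\partial x} + \frac{\partial F_2}{\partial f}F_1
   &= \frac{\partial b}{\partial x}\,f + b\,a\,f .
\end{align*}
% Condition (39.7) holds identically in f, x, y if and only if
%   \partial a/\partial y = \partial b/\partial x .
```

When $\partial a/\partial y = \partial b/\partial x$, the system is completely integrable and a solution with a prescribed value at one point exists; otherwise (39.7) forces $f = 0$, which is the kind of functional relation between the $f$'s and $x$'s mentioned above.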




There are several proofs of the existence of solution of complete systems
of partial differential equations; perhaps the simplest of these was given
by T. Y. Thomas in 1934 in a paper entitled “Systems of Total Differential
Equations Defined over Simply Connected Domains,” Annals of
Mathematics, 35, 730-734 (1934). An earlier proof, assuming the analyticity of
functions $F^\alpha_i$, was given by Bouquet¹¹ in 1872, and there are other proofs
by G. Darboux and E. Cartan. We shall not go into a discussion of the
sufficiency of conditions 39.7, but will merely state an

EXISTENCE THEOREM. Let $R$ be an open n-dimensional simply connected
region referred to the X-system of coordinates, and $R'$ the region composed
of $R$ and the ranges $-\infty < f^\alpha < \infty$. If the functions $F^\alpha_i(x, f)$ are of class
$C^1$ in $R'$ and have bounded derivatives $\partial F^\alpha_i/\partial f^\beta$ in $R'$, and if furthermore the
integrability conditions 39.7 are satisfied identically, then the system 39.5
has one and only one set of solutions

$$f^\alpha = f^\alpha(x^1, \ldots, x^n), \qquad (\alpha = 1, \ldots, m),$$

which for an arbitrary set of values $(x_0^1, \ldots, x_0^n)$ take on the arbitrarily
prescribed values $C^\alpha = f^\alpha(x_0^1, \ldots, x_0^n)$.
We will now apply these results to the special case of the system 39.4 by
identifying it with (39.5).

The dependent variables in (39.4) are $y, u_1, \ldots, u_n$, whereas in (39.5)
they are $f^1, f^2, \ldots, f^m$. Thus we set

$$f^1 = y, \quad f^2 = u_1, \quad \ldots, \quad f^{n+1} = u_n,$$

and the system 39.4 reads

$$\frac{\partial f^1}{\partial x^i} = F^1_i \equiv u_i$$

and

$$\frac{\partial f^\alpha}{\partial x^j} = F^\alpha_j \equiv \left\{{\beta \atop (\alpha - 1)\,j}\right\}u_\beta, \qquad (\alpha = 2, 3, \ldots, n + 1), \quad (j = 1, 2, \ldots, n).$$

The substitution of the expressions for $F^\alpha_i$ in the integrability conditions
39.7 gives

$$\left\{{\alpha \atop ij}\right\}u_\alpha = \left\{{\alpha \atop ji}\right\}u_\alpha, \qquad R^\alpha_{ijk}\,u_\alpha = 0. \tag{39.8}$$

The first of these sets of equations is satisfied identically because of the
symmetry of Christoffel symbols. The second set states that the set of
equations 39.4 will have a solution if the Riemann-Christoffel tensor $R^\alpha_{ijk}$
vanishes identically. Since this tensor vanishes when metric coefficients
are constants, we can enunciate a basic

¹¹ J. C. Bouquet, Bull. Sci. Math. et Astron., 3, (1872) p. 265, G. Darboux, Leçons sur
les systèmes orthogonaux, (1910) pp. 326-335, E. Cartan, Géométrie des espaces de
Riemann, (1928) pp. 54-57. The proof by T. Y. Thomas is quite close in spirit to that
given by Cartan.

THEOREM II. A necessary and sufficient condition that a symmetric
tensor $g_{ij}$ with $|g_{ij}| \neq 0$, reduce under a suitable transformation of coordinates
to a tensor $h_{ij}$, where the $h_{ij}$’s are constants, is that the Riemann-Christoffel
tensor formed from the $g_{ij}$’s be a zero tensor.

We note further that, if the quadratic form $Q = h_{ij}y^iy^j$ is positive
definite, there exists a nonsingular linear transformation reducing $Q$ to the
canonical form $Q = (y^1)^2 + \cdots + (y^n)^2$. Thus if the $g_{ij}(x)$ are the
coefficients in the positive definite quadratic differential form

$$ds^2 = g_{ij}\,dx^i\,dx^j, \tag{39.1}$$

characterizing metric properties of $R_n$, there exists a real functional
transformation $T$: $y^i = y^i(x)$ which reduces it to the form

$$ds^2 = (dy^1)^2 + \cdots + (dy^n)^2, \tag{39.9}$$

provided that $R^i_{jkl}$ vanishes identically in $R_n$.

We recall that a metric manifold $R_n$ in which it is possible to effect the
reduction of the form 39.1 to 39.9 is called an Euclidean n-dimensional
manifold $E_n$, and we see that $R_n$ is Euclidean if, and only if, the Riemann
tensor of the manifold is a zero tensor.

Problems 


1. Verify the substitutions in the integrability conditions 39.7 leading to 
equations 39.8. 

2. Referring to the system 39.5, show that it is completely equivalent to the 
system of total differential equations 


$$df^\alpha = F^\alpha_i\,dx^i.$$
3. What are the integrability conditions for the equation 
P(x, y, z) dx + Q(x, y, z) dy + R(x, y, z) dz = 0? 
Consider also the system 

oF 7 oF a oF 

ax j dy j 
4. Prove a theorem: If $P\,dx + Q\,dy + R\,dz = 0$ is integrable, then

$$\lambda P\,dx + \lambda Q\,dy + \lambda R\,dz = 0$$

is also integrable for any $\lambda(x, y, z)$ of class $C^1$.

5. Deduce the integrability conditions for the equation

$$P_i(x^1, \ldots, x^n)\,dx^i = 0, \qquad (i = 1, 2, \ldots, n).$$




40. The e-Systems and the Generalized Kronecker Deltas 


The notions of symmetry and skew-symmetry with respect to pairs of 
indices (see Sec. 27) can be extended to cover the sets of quantities that 
are symmetric or skew-symmetric with respect to more than two indices. 
We will consider in this section the sets of quantities $A^{i_1\cdots i_k}$ or $A_{i_1\cdots i_k}$,
depending on $k$ indices, written as subscripts or superscripts, although the
quantities $A$ may not represent tensors.

DEFINITION 1. The system of quantities $A^{i_1\cdots i_k}$ (or $A_{i_1\cdots i_k}$), depending on
$k$ indices, is said to be completely symmetric if the value of the symbol $A$ is
unchanged by any permutation of the indices.

DEFINITION 2. The system $A^{i_1\cdots i_k}$ (or $A_{i_1\cdots i_k}$), depending on $k$ indices,
is said to be completely skew-symmetric if the value of the symbol $A$ is
unchanged by any even permutation of the indices, and $A$ merely changes sign
after an odd permutation of the indices.

We recall that any permutation of $n$ distinct objects, say a permutation
of $n$ distinct integers, can be accomplished by a finite number of interchanges
of pairs of these objects and that the number of interchanges required to
bring about a given permutation from a prescribed order is always even or
always odd.

It follows at once from definition 2 that in any skew-symmetric system
the term containing two like indices is necessarily zero. Thus, if one has a
skew-symmetric system of quantities $A_{ijk}$, where $i, j, k$ assume values
1, 2, 3, then $A_{122} = 0$, $A_{123} = -A_{213}$, $A_{312} = A_{123}$, etc. In general, the
components $A_{ijk}$ of a skew-symmetric system satisfy the relations
$A_{ijk} = -A_{ikj} = -A_{jik} = A_{jki} = A_{kij} = -A_{kji}$.

Consider now a skew-symmetric system of quantities $A_{i_1\cdots i_n}$ (or
$A^{i_1\cdots i_n}$), in which the indices $i_1, \ldots, i_n$ assume values $1, 2, \ldots, n$. We
define the e-system as follows.

DEFINITION 3. If the value of $A_{i_1\cdots i_n}$ (or $A^{i_1\cdots i_n}$) is $+1$ when $i_1i_2\cdots i_n$
is an even permutation of the numbers $12\cdots n$, and $-1$ when $i_1i_2\cdots i_n$ is an
odd permutation of $12\cdots n$, and if it is zero in all other cases, then the
system $A_{i_1\cdots i_n}$ (or $A^{i_1\cdots i_n}$) is called the e-system.

We shall use the symbols $e_{i_1\cdots i_n}$ or $e^{i_1\cdots i_n}$ to denote the e-systems.
It will be shown in Sec. 41 that the e-systems are relative tensors.

As an illustration we note that the components of the system $e_{ij}$ are:
$e_{11} = 0$, $e_{12} = 1$, $e_{21} = -1$, $e_{22} = 0$. If the e-system depends on three
indices $ijk$, then $e_{ijk} = 0$ if any two indices are alike, whereas $e_{ijk} = e_{123} = 1$
if $ijk$ is an even permutation of 123 and $e_{ijk} = -e_{123} = -1$ if $ijk$ is an odd
permutation of 123.
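
Definition 3 translates directly into a short routine. The sketch below is one possible way to evaluate the e-symbols, using the parity of the number of inversions as the parity of the permutation; the function name `e_symbol` is a hypothetical helper introduced here, and the same routine serves for subscripts and superscripts since the numerical values coincide.

```python
from itertools import product


def e_symbol(*indices):
    """Value of the e-symbol for indices taken from 1..n, n = len(indices)."""
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0  # some index is repeated, so the component vanishes
    # The parity of the number of inversions equals the parity of the permutation.
    inversions = sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1


# All components of e_ij for n = 2.
e2 = {(i, j): e_symbol(i, j) for i, j in product((1, 2), repeat=2)}

# Nonvanishing components of e_ijk for n = 3.
e3 = {
    ijk: e_symbol(*ijk)
    for ijk in product((1, 2, 3), repeat=3)
    if e_symbol(*ijk) != 0
}
```

For $n = 3$ only six of the 27 components survive, three equal to $+1$ and three equal to $-1$, in line with the illustration above.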
Closely allied to the e-systems are the generalized Kronecker deltas, 


which we proceed to define. 




DEFINITION 4. A symbol $\delta^{i_1\cdots i_k}_{j_1\cdots j_k}$, depending on $k$ superscripts and $k$
subscripts, each of which runs from 1 to $n$, is called a generalized Kronecker
delta provided that: (a) it is completely skew-symmetric in superscripts and
subscripts; (b) if the superscripts are distinct from each other and the
subscripts are the same set of numbers as the superscripts, the value of the
symbol is $+1$ or $-1$ according as an even or odd number of transpositions is
required to arrange the superscripts in the same order as the subscripts;
(c) in all other cases the value of the symbol is zero.

As an illustration consider $\delta^{ij}_{kl}$. It follows from definition 4 that if
$i = j$ or $k = l$, or if the set $ij$ is not the set $kl$, then $\delta^{ij}_{kl} = 0$. In all other
cases $\delta^{ij}_{kl}$ equals $+1$ or $-1$ according to whether $kl$ is an even or an odd
permutation of $ij$. Thus

$$0 = \delta^{11}_{12} = \delta^{12}_{11} = \delta^{12}_{13} = \cdots,$$
$$1 = \delta^{12}_{12} = \delta^{21}_{21} = \delta^{13}_{13} = \cdots,$$
$$-1 = \delta^{12}_{21} = \delta^{21}_{12} = \delta^{13}_{31} = \cdots.$$

We prove in Sec. 41 that the generalized Kronecker deltas are tensors.
From definition 3, it follows that the direct product $e^{i_1\cdots i_n}e_{j_1\cdots j_n}$
of the two systems $e^{i_1\cdots i_n}$ and $e_{j_1\cdots j_n}$ is the generalized Kronecker
delta. For example, $e^{\alpha\beta\gamma}e_{ijk}$ has the following values:

(a) Zero, if two or more subscripts or superscripts are alike.

(b) $+1$, if the difference in the number of transpositions of $\alpha\beta\gamma$ and $ijk$
from 123 is an even number.

(c) $-1$, if the difference in the number of transpositions of $\alpha\beta\gamma$ and $ijk$
from 123 is an odd number.

A little reflection will show that another way of phrasing statements (b)
and (c) is the following:

(b′) $e^{\alpha\beta\gamma}e_{ijk} = +1$, if an even number of transpositions is required to
arrange the subscripts in the same order as the superscripts.

(c′) $e^{\alpha\beta\gamma}e_{ijk} = -1$, if an odd number of transpositions is required to
arrange the subscripts in the same order as the superscripts.

We can thus write

$$e^{\alpha\beta\gamma}e_{ijk} = \delta^{\alpha\beta\gamma}_{ijk}.$$

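It is clear from definitions 3 and 4 that the e-symbols can be defined in
terms of the Kronecker deltas,

$$e^{i_1i_2\cdots i_n} = \delta^{i_1i_2\cdots i_n}_{12\cdots n} \quad\text{and}\quad e_{i_1i_2\cdots i_n} = \delta^{12\cdots n}_{i_1i_2\cdots i_n},$$

since $e = +1$ or $-1$ when the set of distinct integers $i_1i_2\cdots i_n$ is obtained
from the set $12\cdots n$ by an even or an odd permutation, and $e = 0$ in all
other cases. The e-systems and generalized Kronecker deltas prove useful
in calculations involving alternating sets of quantities.

We consider next several examples which permit us to deduce a number
of identities involving operations on these symbols.

Let us contract $\delta^{\alpha\beta\gamma}_{ijk}$ on $k$ and $\gamma$. The result for $n = 3$ is

$$\delta^{\alpha\beta\gamma}_{ij\gamma} = \delta^{\alpha\beta 1}_{ij1} + \delta^{\alpha\beta 2}_{ij2} + \delta^{\alpha\beta 3}_{ij3} \equiv \delta^{\alpha\beta}_{ij}.$$

We observe that this expression vanishes if $i$ and $j$ are equal or if $\alpha$ and $\beta$
are equal. If we set $i = 1$ and $j = 2$, we get $\delta^{\alpha\beta}_{12} = \delta^{\alpha\beta 3}_{123}$, and hence $\delta^{\alpha\beta}_{12} = 0$,
unless $\alpha\beta$ is a permutation of 12. In the latter case $\delta^{\alpha\beta}_{12} = 1$ if $\alpha\beta$ is an
even permutation of 12, and $\delta^{\alpha\beta}_{12} = -1$ for an odd permutation. Similar
results hold for all values of $i$ and $j$ selected from the set of numbers 1, 2, 3.
We thus see that $\delta^{\alpha\beta}_{ij}$ is equal to

(a) 0, if two of the subscripts or superscripts are alike, or when the
subscripts and superscripts are not formed from the same numbers.

(b) $+1$, if $ij$ is an even permutation of $\alpha\beta$.

(c) $-1$, if $ij$ is an odd permutation of $\alpha\beta$.

If we contract $\delta^{\alpha\beta}_{ij}$ and halve the result, we obtain a system depending on
two indices

$$\delta^\alpha_i = \tfrac{1}{2}\delta^{\alpha\beta}_{i\beta} = \tfrac{1}{2}\left(\delta^{\alpha 1}_{i1} + \delta^{\alpha 2}_{i2} + \delta^{\alpha 3}_{i3}\right).$$

If we set $i = 1$ in $\delta^\alpha_i$, we get $\delta^\alpha_1 = \tfrac{1}{2}\left(\delta^{\alpha 2}_{12} + \delta^{\alpha 3}_{13}\right)$. This vanishes unless
$\alpha = 1$, in which event $\delta^1_1 = 1$. Similar results can be obtained by setting
$i = 2$ or $i = 3$. Thus $\delta^\alpha_i$ has the values

(a) 0, if $i \neq \alpha$, $(\alpha, i = 1, 2, 3)$.
(b) 1, if $i = \alpha$.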

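By counting the number of terms appearing in the sums it is not difficult
to show that, in general,

$$\delta^\alpha_i = \frac{1}{n - 1}\,\delta^{\alpha\beta}_{i\beta} \quad\text{and}\quad \delta^{\alpha\beta}_{\alpha\beta} = n(n - 1). \tag{40.1}$$

We can also deduce that

$$\delta^{i_1\cdots i_r}_{j_1\cdots j_r} = \frac{(n - k)!}{(n - r)!}\,\delta^{i_1\cdots i_r i_{r+1}\cdots i_k}_{j_1\cdots j_r i_{r+1}\cdots i_k} \tag{40.2}$$

and

$$\delta^{i_1\cdots i_r}_{i_1\cdots i_r} = n(n - 1)(n - 2)\cdots(n - r + 1) = \frac{n!}{(n - r)!}. \tag{40.3}$$

As a special case of (40.3) we have the formula

$$e^{i_1i_2\cdots i_n}e_{i_1i_2\cdots i_n} = n!, \tag{40.4}$$

and from (40.2) we deduce the relation

$$e^{i_1\cdots i_r i_{r+1}\cdots i_n}e_{j_1\cdots j_r i_{r+1}\cdots i_n} = (n - r)!\,\delta^{i_1\cdots i_r}_{j_1\cdots j_r}. \tag{40.5}$$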
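
The contraction identities lend themselves to a brute-force check for small $n$. The sketch below verifies three standard ones by summing over all index values: the full contraction $\delta^{i_1\cdots i_r}_{i_1\cdots i_r} = n!/(n-r)!$, the product relation $e^{i_1\cdots i_r t_{r+1}\cdots t_n}e_{j_1\cdots j_r t_{r+1}\cdots t_n} = (n-r)!\,\delta^{i_1\cdots i_r}_{j_1\cdots j_r}$, and $\delta^{\alpha\beta}_{i\beta} = (n-1)\delta^\alpha_i$. The helper names are hypothetical and introduced here for illustration.

```python
from itertools import product
from math import factorial


def perm_sign(seq):
    """+1 or -1 according to the parity of the inversions in seq (distinct items)."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def gen_delta(upper, lower):
    """Generalized Kronecker delta; index sets are tuples of integers."""
    if len(set(upper)) < len(upper) or sorted(upper) != sorted(lower):
        return 0
    return perm_sign(upper) * perm_sign(lower)


def check_identities(n):
    """True if the three contraction identities hold for every index choice."""
    values = range(1, n + 1)
    natural = tuple(values)

    def e(idx):
        return gen_delta(idx, natural)

    for r in range(1, n + 1):
        # Full contraction of a delta with r index pairs.
        trace = sum(gen_delta(i, i) for i in product(values, repeat=r))
        if trace != factorial(n) // factorial(n - r):
            return False
        # Product of two e-symbols contracted on the last n - r indices.
        for i in product(values, repeat=r):
            for j in product(values, repeat=r):
                lhs = sum(
                    e(i + t) * e(j + t) for t in product(values, repeat=n - r)
                )
                if lhs != factorial(n - r) * gen_delta(i, j):
                    return False
    # Single contraction of the two-pair delta.
    for alpha in values:
        for i in values:
            lhs = sum(gen_delta((alpha, beta), (i, beta)) for beta in values)
            if lhs != (n - 1) * gen_delta((alpha,), (i,)):
                return False
    return True


results = {n: check_identities(n) for n in (2, 3, 4)}
```

Because every index combination is enumerated, the run time grows quickly with $n$; the values 2, 3, and 4 are enough to exercise each factorial factor.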

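Consider next a set of $n^{p+q}$ quantities $A^{i_1\cdots i_p}_{j_1\cdots j_q}$ (the $i$’s and $j$’s run from
1 to $n$), symmetric in two or more indices (which may be superscripts or
subscripts). We can show that

$$\delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_1j_2\cdots j_q} = 0,$$

if $A^{i_1\cdots i_p}_{j_1\cdots j_q}$ is symmetric in two or more subscripts. Also

$$\delta^{k_1k_2\cdots k_p}_{i_1i_2\cdots i_p}A^{i_1i_2\cdots i_p}_{j_1\cdots j_q} = 0,$$

if $A^{i_1\cdots i_p}_{j_1\cdots j_q}$ is symmetric in two or more superscripts.

Suppose that $A^{i_1\cdots i_p}_{j_1\cdots j_q}$ is symmetric in $j_1$ and $j_2$; then

$$\delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_1j_2\cdots j_q} = \delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_2j_1\cdots j_q} = -\delta^{j_2j_1\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_2j_1\cdots j_q}.$$

However, $j_1$ and $j_2$ are the dummy indices; hence

$$\delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_1j_2\cdots j_q} = -\delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_1j_2\cdots j_q}.$$

Thus

$$\delta^{j_1j_2\cdots j_q}_{k_1k_2\cdots k_q}A^{i_1\cdots i_p}_{j_1j_2\cdots j_q} = 0.$$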
Problems 
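1. (a) Show that $\delta^{ijk}_{ijk} = 3!$ if $i, j, k = 1, 2, 3$.

(b) Show that

$$\delta^{ij}_{\alpha\beta} = \begin{vmatrix} \delta^i_\alpha & \delta^i_\beta \\ \delta^j_\alpha & \delta^j_\beta \end{vmatrix} \quad\text{and}\quad \delta^{ijk}_{\alpha\beta\gamma} = \begin{vmatrix} \delta^i_\alpha & \delta^i_\beta & \delta^i_\gamma \\ \delta^j_\alpha & \delta^j_\beta & \delta^j_\gamma \\ \delta^k_\alpha & \delta^k_\beta & \delta^k_\gamma \end{vmatrix}.$$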


2. Expand for n = 3: 
(a) 616%. (b) d}Paty). (e) ryt, 
(d) duty, (e) di. 
3. Expand for n = 2: 
(a) eřała?, (b) e*a?a}. EO) e*Fatai, = e jal. 
4. If a set of quantities $A_{i_1\cdots i_k}$ is skew-symmetric in the subscripts ($k$ in
number), then

$$\delta^{i_1\cdots i_k}_{j_1\cdots j_k}A_{i_1\cdots i_k} = k!\,A_{j_1\cdots j_k}.$$

5. If $A_{ijk}$ is completely symmetric and the indices run from 1 to $n$, show that
the number of distinct terms in the set $\{A_{ijk}\}$ is

$$N = n + n(n - 1) + \frac{n(n - 1)(n - 2)}{3!}.$$

Hint: Consider the cases where the subscripts $ijk$ are all alike, when only two
are distinct, and when all are distinct.

6. Show that the number of distinct, nonvanishing $A_{ijk}$’s in Problem 5 is

$$\frac{n(n - 1)(n - 2)}{3!}$$

when $A_{ijk}$ is completely skew-symmetric.


41. Application of the e-Systems to Determinants. Tensor Character 
of Generalized Kronecker Deltas 


We recall that the determinant $|a^i_j|$ of nth order, with elements $a^i_j$,
consists of the sum of products of the elements where each term in the sum
contains one and only one element from each row and each column of
the determinant. The sign of each term in the sum is determined by the
character of permutation of the indices. Thus, if the superscripts in the
product $a^1_{i_1}a^2_{i_2}\cdots a^n_{i_n}$ are arranged in the normal order $12\cdots n$, then
the product will carry the plus sign if the number of transpositions
necessary to arrange the subscripts in the normal order is even. The sign is
minus if the required number of transpositions is odd. Since $e^{i_1i_2\cdots i_n} =
\delta^{i_1i_2\cdots i_n}_{12\cdots n}$ and $\delta^{12\cdots n}_{i_1i_2\cdots i_n} = e_{i_1i_2\cdots i_n}$, the determinant

$$|a^i_j| = \begin{vmatrix} a^1_1 & a^1_2 & \cdots & a^1_n \\ a^2_1 & a^2_2 & \cdots & a^2_n \\ \vdots & \vdots & & \vdots \\ a^n_1 & a^n_2 & \cdots & a^n_n \end{vmatrix} \equiv a \tag{41.1}$$

can be written compactly as

$$a = e^{i_1i_2\cdots i_n}a^1_{i_1}a^2_{i_2}\cdots a^n_{i_n} = e_{i_1i_2\cdots i_n}a^{i_1}_1a^{i_2}_2\cdots a^{i_n}_n. \tag{41.2}$$

As an example consider

$$|a^i_j| = \begin{vmatrix} a^1_1 & a^1_2 & a^1_3 \\ a^2_1 & a^2_2 & a^2_3 \\ a^3_1 & a^3_2 & a^3_3 \end{vmatrix} = a.$$

If this determinant is expanded by columns we get $a = \sum \pm\, a^i_1a^j_2a^k_3$,
where $ijk$ is a permutation of 123. The plus or minus sign is assigned to the
term $a^i_1a^j_2a^k_3$ according to whether this permutation is even or odd.
Hence this determinant can be written $|a^i_j| = e_{ijk}a^i_1a^j_2a^k_3$. On the other
hand, if it is expanded by rows, we can write $|a^i_j| = e^{ijk}a^1_ia^2_ja^3_k$.
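
Both expansions can be written out in a few lines. The sketch below sums over permutations only, since every other index choice has a vanishing e-symbol; the nested list `a[r][c]` is assumed here to hold the element with superscript (row) r+1 and subscript (column) c+1, and the matrix entries are arbitrary sample values.

```python
from itertools import permutations


def perm_sign(p):
    """+1 or -1 according to the parity of the inversions in the permutation p."""
    inversions = sum(
        1
        for a in range(len(p))
        for b in range(a + 1, len(p))
        if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def det_by_rows(a):
    """Expansion by rows: sum over subscripts, superscripts in normal order."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for row in range(n):
            term *= a[row][p[row]]
        total += term
    return total


def det_by_columns(a):
    """Expansion by columns: sum over superscripts, subscripts in normal order."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for col in range(n):
            term *= a[p[col]][col]
        total += term
    return total


sample = [
    [2, 0, 1],
    [1, 3, -1],
    [0, 5, 4],
]
value_rows = det_by_rows(sample)
value_columns = det_by_columns(sample)
```

For the sample matrix a cofactor expansion along the first row gives $2(12 + 5) + 1 \cdot 5 = 39$, and both routines should reproduce that value.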

Consider next the sum

$$e_{ijk}a^i_\alpha a^j_\beta a^k_\gamma, \qquad (i, j, k, \alpha, \beta, \gamma = 1, 2, 3).$$

We will show first that this system is completely skew-symmetric in $\alpha\beta\gamma$.
Since the indices $ijk$ are dummy indices, we can change them at will and
write

$$e_{ijk}a^i_\alpha a^j_\beta a^k_\gamma = e_{kji}a^k_\alpha a^j_\beta a^i_\gamma = e_{kji}a^i_\gamma a^j_\beta a^k_\alpha.$$

If $k$ and $i$ are interchanged in $e_{kji}$, this e-symbol will change sign, and hence

$$e_{ijk}a^i_\alpha a^j_\beta a^k_\gamma = -e_{ijk}a^i_\gamma a^j_\beta a^k_\alpha.$$

This shows that an interchange of $\alpha$ and $\gamma$ changes the sign, so that the
system under consideration is skew-symmetric in $\alpha$ and $\gamma$. Similar results
obviously hold for other indices. A special case of this system is the
determinant $|a^i_j| = e_{ijk}a^i_1a^j_2a^k_3$, and it follows from the foregoing that

$$e_{ijk}a^i_\alpha a^j_\beta a^k_\gamma = |a^i_j|\,e_{\alpha\beta\gamma}.$$

Similarly, we can show that

$$e^{ijk}a^\alpha_i a^\beta_j a^\gamma_k = |a^i_j|\,e^{\alpha\beta\gamma}.$$

It follows at once from these expressions that an interchange of two columns
(or two rows) of the determinant $|a^i_j|$ changes its sign, and if two columns
in it are identical, then its value is zero.

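These results can be immediately generalized to determinants of nth
order, so that for any permutation of rows we can write

$$e^{\alpha_1\cdots\alpha_n}\,|a^i_j| = e^{i_1\cdots i_n}a^{\alpha_1}_{i_1}a^{\alpha_2}_{i_2}\cdots a^{\alpha_n}_{i_n}, \tag{41.3}$$

and for any permutation of columns

$$e_{\alpha_1\cdots\alpha_n}\,|a^i_j| = e_{i_1\cdots i_n}a^{i_1}_{\alpha_1}a^{i_2}_{\alpha_2}\cdots a^{i_n}_{\alpha_n}. \tag{41.4}$$

We use formula 41.4 to establish the formula for the product of two
determinants. The power and compactness of this notation are strikingly
demonstrated in this derivation.

Since $|b^i_j| = e_{i_1\cdots i_n}b^{i_1}_1b^{i_2}_2\cdots b^{i_n}_n$, we can write

$$|a^i_j| \cdot |b^i_j| = |a^i_j|\,e_{i_1\cdots i_n}b^{i_1}_1b^{i_2}_2\cdots b^{i_n}_n = \left(e_{j_1\cdots j_n}a^{j_1}_{i_1}a^{j_2}_{i_2}\cdots a^{j_n}_{i_n}\right)\left(b^{i_1}_1b^{i_2}_2\cdots b^{i_n}_n\right),$$

where we have made use of the formula 41.4. Thus

$$|a^i_j| \cdot |b^i_j| = e_{j_1\cdots j_n}\left(a^{j_1}_{i_1}b^{i_1}_1\right)\left(a^{j_2}_{i_2}b^{i_2}_2\right)\cdots\left(a^{j_n}_{i_n}b^{i_n}_n\right) = |c^i_j|,$$

where

$$c^i_j = a^i_\alpha b^\alpha_j = a^i_1b^1_j + a^i_2b^2_j + \cdots + a^i_nb^n_j.$$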
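
The product rule for determinants can be spot-checked numerically. The sketch below evaluates determinants from the permutation expansion and forms $c^i_j = a^i_\alpha b^\alpha_j$ explicitly; the two sample matrices are arbitrary integers chosen here for illustration, stored so that `m[i][j]` holds the element with superscript i+1 and subscript j+1.

```python
from itertools import permutations


def perm_sign(p):
    """+1 or -1 according to the parity of the inversions in the permutation p."""
    inversions = sum(
        1
        for a in range(len(p))
        for b in range(a + 1, len(p))
        if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant from the column expansion with e-symbols."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for col in range(n):
            term *= m[p[col]][col]
        total += term
    return total


def inner_product(a, b):
    """c^i_j = a^i_alpha b^alpha_j, summed on alpha."""
    n = len(a)
    return [
        [sum(a[i][alpha] * b[alpha][j] for alpha in range(n)) for j in range(n)]
        for i in range(n)
    ]


a = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
b = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
c = inner_product(a, b)

det_a, det_b, det_c = det(a), det(b), det(c)
```

With integer entries the comparison of `det_c` with `det_a * det_b` is exact, so no tolerance is needed.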


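The expansion of the determinant in terms of the elements of the first
column and their cofactors can be written

$$|a^i_j| = a^{i_1}_1\,e_{i_1i_2\cdots i_n}a^{i_2}_2\cdots a^{i_n}_n = a^{i_1}_1A^1_{i_1}, \tag{41.5}$$

where $A^1_{i_1} = e_{i_1i_2\cdots i_n}a^{i_2}_2a^{i_3}_3\cdots a^{i_n}_n$ is the cofactor of the element $a^{i_1}_1$.

We derive next the formula for the partial derivatives of a determinant
whose elements $a^i_j$ are functions of the variables $x^1, x^2, \ldots, x^n$. From
formula 41.2 we have

$$a = e_{i_1i_2\cdots i_n}a^{i_1}_1a^{i_2}_2\cdots a^{i_n}_n.$$

Differentiating this expression, we get

$$\frac{\partial a}{\partial x^k} = e_{i_1i_2\cdots i_n}\frac{\partial a^{i_1}_1}{\partial x^k}a^{i_2}_2\cdots a^{i_n}_n + e_{i_1i_2\cdots i_n}a^{i_1}_1\frac{\partial a^{i_2}_2}{\partial x^k}\cdots a^{i_n}_n + \cdots + e_{i_1i_2\cdots i_n}a^{i_1}_1a^{i_2}_2\cdots\frac{\partial a^{i_n}_n}{\partial x^k}$$

$$= \frac{\partial a^{i_1}_1}{\partial x^k}A^1_{i_1} + \frac{\partial a^{i_2}_2}{\partial x^k}A^2_{i_2} + \cdots + \frac{\partial a^{i_n}_n}{\partial x^k}A^n_{i_n}$$

$$= \frac{\partial a^i_j}{\partial x^k}A^j_i,$$

by formula of the type 41.5.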
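
The rule for differentiating a determinant is easy to confirm for $n = 2$. In the sketch below $A^j_i$ denotes the cofactor of the element $a^i_j$ (superscript for the row, subscript for the column), and $x^k$ is any one of the independent variables; both are notational assumptions made here for the illustration.

```latex
% Second-order determinant and its cofactors:
%   a = a^1_1 a^2_2 - a^1_2 a^2_1,
%   A^1_1 = a^2_2,  A^1_2 = -a^1_2,  A^2_1 = -a^2_1,  A^2_2 = a^1_1.
\begin{align*}
\frac{\partial a^i_j}{\partial x^k}A^j_i
  &= \frac{\partial a^1_1}{\partial x^k}\,a^2_2
   - \frac{\partial a^2_1}{\partial x^k}\,a^1_2
   - \frac{\partial a^1_2}{\partial x^k}\,a^2_1
   + \frac{\partial a^2_2}{\partial x^k}\,a^1_1 \\
  &= \frac{\partial}{\partial x^k}\left(a^1_1 a^2_2 - a^1_2 a^2_1\right)
   = \frac{\partial a}{\partial x^k}.
\end{align*}
```

Each element is differentiated once and multiplied by its own cofactor, which is exactly what the ordinary product rule gives when applied to the two terms of the determinant.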

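Formulas 41.3 and 41.4 permit us to establish the fact that the
permutation symbols $e^{i_1i_2\cdots i_n}$ and $e_{i_1i_2\cdots i_n}$ are relative tensors of weights $+1$ and
$-1$, respectively.

Consider an admissible transformation

$$T:\quad y^i = y^i(x^1, x^2, \ldots, x^n),$$

and its Jacobian $J = \left|\dfrac{\partial y^i}{\partial x^j}\right|$. If we set $a^i_j = \dfrac{\partial y^i}{\partial x^j}$ in formula 41.3, and recall
that $\dfrac{1}{J} = \left|\dfrac{\partial x^i}{\partial y^j}\right|$, we obtain at once

$$e^{\alpha_1\cdots\alpha_n} = \left|\frac{\partial x^i}{\partial y^j}\right|\frac{\partial y^{\alpha_1}}{\partial x^{i_1}}\cdots\frac{\partial y^{\alpha_n}}{\partial x^{i_n}}\,e^{i_1\cdots i_n}, \tag{41.6}$$

which is the law of transformation of relative contravariant tensors of
weight $+1$. In an entirely similar way we deduce that

$$e_{\alpha_1\cdots\alpha_n} = \left|\frac{\partial x^i}{\partial y^j}\right|^{-1}\frac{\partial x^{i_1}}{\partial y^{\alpha_1}}\cdots\frac{\partial x^{i_n}}{\partial y^{\alpha_n}}\,e_{i_1\cdots i_n}, \tag{41.7}$$

so that $e_{i_1i_2\cdots i_n}$ is a relative tensor of weight $-1$.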




From formula 40.5,

$$e^{i_1 \cdots i_r i_{r+1} \cdots i_n}\,e_{j_1 \cdots j_r i_{r+1} \cdots i_n} = (n - r)!\;\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r},$$

we see that the Kronecker delta $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is obtained by multiplying together two e-symbols, one of which is a relative tensor of weight $+1$ and the other of weight $-1$, and contracting with respect to a number of indices. The result is a tensor of weight zero, that is, an ordinary tensor. Thus we have proved that the generalized Kronecker deltas are absolute tensors.
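For n = 3 and r = 2 the contraction of two e-symbols over one index should reproduce the generalized Kronecker delta with two upper and two lower indices, which in turn equals $\delta^i_k \delta^j_l - \delta^i_l \delta^j_k$. The Python sketch below checks this exhaustively over all index values; the helper names `e3`, `kron`, and `delta2` are assumptions made for this illustration, and indices run over 0, 1, 2.

```python
from itertools import product


def e3(i, j, k):
    """Permutation symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def kron(i, j):
    """Ordinary Kronecker delta."""
    return 1 if i == j else 0


def delta2(i, j, k, l):
    """Contraction e^{ijm} e_{klm}; here (n - r)! = 1, so no further factor is needed."""
    return sum(e3(i, j, m) * e3(k, l, m) for m in range(3))


agrees = all(
    delta2(i, j, k, l) == kron(i, k) * kron(j, l) - kron(i, l) * kron(j, k)
    for i, j, k, l in product(range(3), repeat=4)
)
print(agrees)
```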


Since $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r,\,k}$ reduces to $\dfrac{\partial \delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}}{\partial x^k} = 0$ when the coordinate system X is cartesian, we conclude that the covariant derivatives of generalized Kronecker deltas vanish identically. Thus the Kronecker deltas behave as constants in a covariant differentiation.


Problems

1. Verify that $\delta^{ij}_{\alpha\beta}\,a^{\alpha\beta} = a^{ij} - a^{ji}$.

2. Verify that $\delta^{ijk}_{\alpha\beta\gamma}\,a^{\alpha\beta\gamma} = a^{ijk} - a^{ikj} + a^{jki} - a^{jik} + a^{kij} - a^{kji}$.

3. If $a_{ij}$ satisfies the equation

$$b\,a_{ij} + c\,a_{ji} = 0,$$

then either $b = -c$ and $a_{ij}$ is symmetric, or $b = c$ and $a_{ij}$ is skew-symmetric. Hint: Since i and j take on values $1, \ldots, n$, the equation can be written as

$$b\,a_{ji} + c\,a_{ij} = 0.$$

Add and obtain $(b + c)(a_{ij} + a_{ji}) = 0$.

4. Show that (a) $e_{ij}\,e^{ik} = \delta^k_j$, (b) $e_{ijk}\,e^{irs} = \delta^r_j \delta^s_k - \delta^s_j \delta^r_k$. Hint: The left-hand member in (b) is zero unless j and k are distinct and j, k is a permutation of r, s. If j = r and k = s, then the left-hand member is $+1$; if j = s and k = r, then its value is $-1$. Consider now the value of the right-hand member for the same choices of indices.

5. List the values of $\delta^i_j$, $\delta^{ij}_{kl}$, $\delta^{ijk}_{lmn}$ when the indices range from 1 to n.


3 


GEOMETRY 


42. Non-Euclidean Geometries 


There is no branch of mathematics in which the tyranny of authority 
has been felt more strongly than in geometry. The traditional Euclidean 
geometry, based on a set of “self-evident truths” and created largely by 
the Alexandrian School of mathematicians (around 300 B.C.), dominated
the thought and shaped the development of physics and astronomy for 
over 2000 years. There were a few bold souls, even among the ancient 
mathematicians, to whom “self-evident truths” contained in Euclid’s 
axioms did not seem convincing, but the prestige of logical structure of 
Euclid’s Elements was so high and the hand of authority so heavy that 
they hindered the development of mathematics for centuries. 

In 1621, Sir Henry Savile raised some questions concerning what he 
called “two blemishes” in geometry, the theory of proportion and the 
theory of parallels. Euclid’s axiom of parallels (Postulate V in the first 
book of Elements) is to the effect that any two given lines in a plane, 
when produced indefinitely, will intersect if the sum of two interior angles 
made by a transversal with these lines is less than two right angles. The 
fact that some of Euclid’s propositions, dealing essentially with the 
converse of this postulate, can be proved without invoking Postulate V 
gave hope that the postulate itself might be deduced from his other 
axioms. However, all attempts to prove the fifth postulate proved un- 
successful, and a hope that contradictions would emerge if this postulate 
were abrogated while others were retained led nowhere. In 1826, a 
Russian mathematician, Nicolai Lobachevski, presented to the mathematics
faculty of the University of Kazan a paper based on an assumption
that it is possible to draw through any point in the plane two lines
parallel to a given line. The geometry developed by Lobachevski proved 
just as devoid of inner inconsistencies as Euclidean geometry. Indeed, 
it contained the latter as a special case and implied the arbitrariness of 
the concept of length adopted in Euclidean geometry. 



In 1831, a Hungarian mathematician, John Bolyai, published results 
of his independent investigations which conceptually differ little from 
those of Lobachevski, but which perhaps contain a deeper appreciation 
of the metric properties of space. Bolyai pointed out, just as Lobachevski 
did, that his geometry in the small is approximately Euclidean and that 
only a physical experiment can decide whether Euclidean or non-Euclidean 
geometry should be adopted for the purposes of physical measurement. 
Thus it appears that there are no a priori reasons for preferring one 
geometry to another. However, it was only after Riemann’s profound 
dissertation on the hypotheses underlying the foundations of geometry 
appeared in print (published posthumously in 1867) that the mathe- 
matical world recognized fully the role played by the metric concepts in 
geometry. 

Riemann appears to have been unaware of the work of Lobachevski 
and Bolyai, although it was well known to Gauss. Later Beltrami pub- 
lished his classical paper on the interpretation of non-Euclidean geometries 
(1868) in which he analyzed the work of Lobachevski, Bolyai, and 
Riemann and stressed the fact that the metric properties of space are 
mere definitions. From these researches it appeared that three con- 
sistent geometries are possible on surfaces of constant curvature: 
the Lobachevskian, on a surface of constant negative curvature; the 
Riemannian, on a surface of constant positive curvature; and the 
Euclidean, on a surface of zero curvature. These geometries are also 
called hyperbolic, elliptic, and parabolic, respectively. We consider
them briefly in the next section. 


43. Length of Arc 


Let the n-dimensional space R be covered by a coordinate system X, and consider a one-dimensional subspace of R determined by

$$C:\quad x^i = x^i(t), \qquad (i = 1, \ldots, n), \tag{43.1}$$

where t is a real parameter varying continuously in the interval $t_1 \le t \le t_2$. The one-dimensional manifold C is called an arc of a curve. In this book we deal only with those curves for which $x^i(t)$ and $\dot{x}^i(t) = dx^i/dt$ are continuous functions in $t_1 \le t \le t_2$. The definition of the arc of a curve given here is a direct generalization of the parametric representation of curves of elementary analytic geometry.

Let $F(x^1, \ldots, x^n, \dot{x}^1, \ldots, \dot{x}^n)$, viewed as a function of t, be a prescribed continuous function in the interval $t_1 \le t \le t_2$. We suppose that¹ $F(x, \dot{x}) > 0$, unless every $\dot{x}^i = 0$, and that for every positive number k

$$F(x^1, \ldots, x^n, k\dot{x}^1, \ldots, k\dot{x}^n) = k\,F(x^1, \ldots, x^n, \dot{x}^1, \ldots, \dot{x}^n).$$


The integral

$$s = \int_{t_1}^{t_2} F(x, \dot{x})\,dt \tag{43.2}$$

is called the length of C, and the space R is said to be metrized by formula 43.2.

Different choices of functions $F(x, \dot{x})$ lead to different metric geometries. If one chooses to define the length of arc by the formula

$$s = \int_{t_1}^{t_2} \sqrt{g_{pq}\,\frac{dx^p}{dt}\,\frac{dx^q}{dt}}\;dt, \qquad (p, q = 1, \ldots, n), \tag{43.3}$$

where $g_{pq}\,\dot{x}^p \dot{x}^q$ is a positive definite quadratic form in the variables $\dot{x}^i$, then the resulting geometry is the Riemannian geometry, and the space R metrized in this way is the Riemannian n-dimensional space $R_n$.


¹ A function $F(x, \dot{x})$ satisfying the condition $F(x, k\dot{x}) = kF(x, \dot{x})$ for every $k > 0$ is called positively homogeneous of degree 1 in the $\dot{x}^i$. This condition is both necessary and sufficient to ensure the independence of the value of the integral 43.2 of a particular mode of parametrization of C. Thus, if t in (43.1) is replaced by some function $t = \phi(s)$, and we denote $x^i[\phi(s)]$ by $\xi^i(s)$ so that $x^i(t) = \xi^i(s)$, we have the equality

$$\int_{t_1}^{t_2} F(x, \dot{x})\,dt = \int_{s_1}^{s_2} F(\xi, \xi')\,ds,$$

where $\xi'^i(s) = d\xi^i/ds$ and $t_1 = \phi(s_1)$ and $t_2 = \phi(s_2)$.

To prove this theorem, suppose that k is an arbitrary positive number, and set $t = ks$, so that $t_1 = ks_1$ and $t_2 = ks_2$. Then (43.1) becomes

$$C:\quad x^i(ks) = \xi^i(s)$$

and

$$\xi'^i(s) = \frac{dx^i(ks)}{ds} = k\,\dot{x}^i(ks).$$

If these values are inserted in (43.2), we get

$$s = \int_{s_1}^{s_2} F[x^i(ks), \dot{x}^i(ks)]\,k\,ds,$$

and if this is to equal

$$s = \int_{s_1}^{s_2} F[\xi^i(s), \xi'^i(s)]\,ds,$$

we must have the relation $F(\xi, \xi') = F(x, k\dot{x}) = kF(x, \dot{x})$. Conversely, if this relation is true for every line element of C and each $k > 0$, then the equality of integrals is assured for every choice of parameter $t = \phi(s)$, $\phi'(s) > 0$, $s_1 \le s \le s_2$, with $t_1 = \phi(s_1)$ and $t_2 = \phi(s_2)$.
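The role of positive homogeneity can be illustrated numerically. In the Python sketch below a quarter of the unit circle is traversed with two different parameters, $t$ and $s$ with $t = \phi(s)$, $\phi'(s) > 0$. The integral of a function that is homogeneous of degree 1 in the velocities gives the same value for both, whereas the integral of a function that is homogeneous of degree 2 does not. The curve, the reparametrization `phi`, and all function names are assumptions chosen for this example.

```python
import math


def integrate(f, lo, hi, n=2000):
    """Composite midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


def curve_integral(F, x, lo, hi, d=1e-6):
    """Integral of F(x, xdot) over lo <= p <= hi; xdot by central differences."""
    def integrand(p):
        ahead, behind = x(p + d), x(p - d)
        vel = [(u - v) / (2 * d) for u, v in zip(ahead, behind)]
        return F(x(p), vel)
    return integrate(integrand, lo, hi)


def circle(t):
    """Unit circle with parameter t."""
    return (math.cos(t), math.sin(t))


def phi(s):
    """Change of parameter t = phi(s); phi' > 0, phi(0) = 0, phi(1) = pi/2."""
    return 0.25 * math.pi * (s ** 3 + s)


def circle_s(s):
    """The same arc referred to the parameter s."""
    return circle(phi(s))


def F_length(pos, vel):
    """Positively homogeneous of degree 1 in the velocities."""
    return math.sqrt(vel[0] ** 2 + vel[1] ** 2)


def F_square(pos, vel):
    """Homogeneous of degree 2 in the velocities."""
    return vel[0] ** 2 + vel[1] ** 2


len_t = curve_integral(F_length, circle, 0.0, math.pi / 2)
len_s = curve_integral(F_length, circle_s, 0.0, 1.0)
sq_t = curve_integral(F_square, circle, 0.0, math.pi / 2)
sq_s = curve_integral(F_square, circle_s, 0.0, 1.0)
print(len_t, len_s, sq_t, sq_s)
```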




We recall from Sec. 39 that, if there exists an admissible transformation of coordinates $T:\ y^i = y^i(x^1, \ldots, x^n)$, such that the square of the element of arc ds,

$$ds^2 = g_{pq}\,dx^p\,dx^q, \tag{43.4}$$

can be reduced to the form

$$ds^2 = dy^i\,dy^i, \tag{43.5}$$

then the Riemannian manifold $R_n$ is said to reduce to an n-dimensional Euclidean manifold $E_n$. The reference frame Y in which the element of arc of C in $E_n$ is given by (43.5) is called an orthogonal cartesian reference frame. Obviously, $E_n$ is a generalization of the so-called Euclidean plane determined by the totality of pairs of real values $(y^1, y^2)$. If these values $(y^1, y^2)$ are associated with the points of the plane referred to a pair of orthogonal cartesian axes, then the square of the element of arc ds assumes the familiar form $ds^2 = (dy^1)^2 + (dy^2)^2$.

In what follows we find it convenient to represent pairs of real values $(y^1, y^2)$ as points in a cartesian plane even when the metric of the $y^i$-manifold is not Euclidean. To illustrate what is meant, consider a sphere S of radius a, immersed in a three-dimensional Euclidean manifold $E_3$, with center at the origin (0, 0, 0) of the set of orthogonal cartesian axes $O$-$X^1X^2X^3$. Let T be a plane tangent to S at $(0, 0, -a)$, and let the points of this plane be referred to a set of orthogonal cartesian axes $O'$-$Y^1Y^2$ as shown in Fig. 8. If we draw from O(0, 0, 0) a radial line OP, intersecting the sphere S at $P(x^1, x^2, x^3)$ and the plane T at $Q(y^1, y^2, -a)$, then the points P on the lower half of the sphere S are in one-to-one correspondence with points $(y^1, y^2)$ of the tangent plane T.

Fig. 8

To obtain an explicit analytic form for this correspondence, we note that, if $P(x^1, x^2, x^3)$ is any point on the radial line OP, then the symmetric equations of this line furnish us with the ratios

$$\frac{x^1}{y^1} = \frac{x^2}{y^2} = \frac{x^3}{-a} = \lambda,$$

or

$$x^1 = \lambda y^1, \qquad x^2 = \lambda y^2, \qquad x^3 = -\lambda a. \tag{43.6}$$

Since we are concerned with the images Q of points P lying on S, the variables $x^i$ satisfy the equation of S,

$$(x^1)^2 + (x^2)^2 + (x^3)^2 = a^2,$$

or

$$\lambda^2\left[(y^1)^2 + (y^2)^2 + a^2\right] = a^2.$$

Solving for $\lambda$ and substituting in (43.6), we get

$$x^1 = \frac{a\,y^1}{\sqrt{(y^1)^2 + (y^2)^2 + a^2}}, \qquad x^2 = \frac{a\,y^2}{\sqrt{(y^1)^2 + (y^2)^2 + a^2}}, \qquad x^3 = \frac{-a^2}{\sqrt{(y^1)^2 + (y^2)^2 + a^2}}. \tag{43.7}$$

These are the desired equations giving the analytical one-to-one correspondence of the points Q on T and points P on the portion of S under consideration.

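Let $P_1(x^1, x^2, x^3)$ and $P_2(x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$ be two nearby points on some curve C lying on S. The Euclidean distance $P_1P_2$, along C, is given by the formula

$$ds^2 = dx^i\,dx^i, \qquad (i = 1, 2, 3), \tag{43.8}$$

and, since the variables $x^i$ are related to $y^\alpha$ by (43.7),

$$dx^i = \frac{\partial x^i}{\partial y^\alpha}\,dy^\alpha, \qquad (\alpha = 1, 2).$$

Thus (43.8) yields a formula

$$ds^2 = \frac{\partial x^i}{\partial y^\alpha}\,\frac{\partial x^i}{\partial y^\beta}\,dy^\alpha\,dy^\beta = g_{\alpha\beta}(y)\,dy^\alpha\,dy^\beta, \qquad (\alpha, \beta = 1, 2),$$

where the $g_{\alpha\beta}(y)$ are functions of $y^\alpha$ computed from (43.7) with the aid of the definition $g_{\alpha\beta} = \dfrac{\partial x^i}{\partial y^\alpha}\,\dfrac{\partial x^i}{\partial y^\beta}$.

If the image K of C on T is given by the equations

$$K:\quad y^\alpha = y^\alpha(t), \qquad t_1 \le t \le t_2,$$

then the length of C can be computed from the integral

$$s = \int_{t_1}^{t_2} \sqrt{g_{\alpha\beta}\,\dot{y}^\alpha \dot{y}^\beta}\;dt.$$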


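A straightforward calculation gives

$$ds^2 = \frac{(dy^1)^2 + (dy^2)^2 + \dfrac{1}{a^2}\,(y^1\,dy^2 - y^2\,dy^1)^2}{\left[1 + \dfrac{1}{a^2}\left((y^1)^2 + (y^2)^2\right)\right]^2} \tag{43.9}$$

and

$$s = \int_{t_1}^{t_2} \frac{\sqrt{(\dot{y}^1)^2 + (\dot{y}^2)^2 + \dfrac{1}{a^2}\,(y^1\dot{y}^2 - y^2\dot{y}^1)^2}}{1 + \dfrac{1}{a^2}\left[(y^1)^2 + (y^2)^2\right]}\;dt.$$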
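The "straightforward calculation" can be checked numerically. The Python sketch below computes $g_{\alpha\beta}$ from the mapping (43.7) by central differences and compares it with the coefficients read off from (43.9), namely $g_{11} = (1 + (y^2)^2/a^2)/w$, $g_{22} = (1 + (y^1)^2/a^2)/w$, $g_{12} = -(y^1 y^2/a^2)/w$ with $w = [1 + ((y^1)^2 + (y^2)^2)/a^2]^2$. The radius value, the sample point, and the function names are illustrative assumptions.

```python
import math

A_RADIUS = 2.0   # sphere radius a (illustrative value)


def sphere_point(y1, y2, a=A_RADIUS):
    """Equations (43.7): point of S corresponding to (y1, y2) on the plane T."""
    root = math.sqrt(y1 * y1 + y2 * y2 + a * a)
    return (a * y1 / root, a * y2 / root, -a * a / root)


def metric_from_embedding(y1, y2, a=A_RADIUS, h=1e-6):
    """g_ab = (dx^i/dy^a)(dx^i/dy^b), with derivatives by central differences."""
    d1 = [(p - m) / (2 * h)
          for p, m in zip(sphere_point(y1 + h, y2, a), sphere_point(y1 - h, y2, a))]
    d2 = [(p - m) / (2 * h)
          for p, m in zip(sphere_point(y1, y2 + h, a), sphere_point(y1, y2 - h, a))]
    cols = (d1, d2)
    return [[sum(u * v for u, v in zip(cols[r], cols[c])) for c in range(2)]
            for r in range(2)]


def metric_from_43_9(y1, y2, a=A_RADIUS):
    """Coefficients of dy^a dy^b obtained by expanding the numerator of (43.9)."""
    w = (1.0 + (y1 * y1 + y2 * y2) / a ** 2) ** 2
    g11 = (1.0 + y2 * y2 / a ** 2) / w
    g22 = (1.0 + y1 * y1 / a ** 2) / w
    g12 = -(y1 * y2 / a ** 2) / w
    return [[g11, g12], [g12, g22]]


g_num = metric_from_embedding(0.3, -0.8)
g_formula = metric_from_43_9(0.3, -0.8)
print(g_num)
print(g_formula)
```

Calling `metric_from_43_9` with a very large radius gives coefficients close to those of the identity matrix, in line with the Euclidean limit discussed in the text.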


We see that the resulting formulas refer to a two-dimensional manifold determined by the variables $(y^1, y^2)$ in the cartesian plane T and that the geometry of the surface of the sphere imbedded in a three-dimensional Euclidean manifold can be visualized on a two-dimensional manifold $R_2$ with metric determined by (43.9). If the radius of S is very large, we see from (43.9) that the terms involving $1/a^2$ can be neglected, and the geometry of the surface of the sphere is then determined approximately by the Euclidean metric

$$ds^2 = (dy^1)^2 + (dy^2)^2. \tag{43.10}$$

Thus, for large values of a, metric properties of the sphere S are indistinguishable from those of the Euclidean plane. The sum of the angles of a curvilinear triangle drawn on S will be nearly equal to 180°, since the sum of the angles of the corresponding triangle on T is 180° by Euclidean geometry. Because of the limitations of measuring devices it may be impossible to decide a priori whether Euclidean formula 43.10 or the more involved Riemannian formula 43.9 should be adopted as a basis for physical measurements.

The chief point of this illustration is to indicate that the geometry of a sphere, imbedded in a Euclidean 3-space with the element of arc in the form 43.8, is indistinguishable from the Riemannian geometry of a two-dimensional manifold $R_2$ with metric 43.9. The latter manifold, although referred to a cartesian frame Y, is not Euclidean since (43.9) cannot be reduced by an admissible transformation to (43.10).

Similarly, the geometry of Lobachevski can be visualized on a surface of a "pseudosphere," a surface of constant negative curvature generated by revolving a tractrix,

$$x = a\left(\cos t + \log \tan \frac{t}{2}\right), \qquad y = a \sin t,$$

about its asymptote. Since we will have no occasion to study the Lobachevskian or hyperbolic geometry, we will only indicate the main ideas leading to the analytical expression for the square of the element of arc

$$ds^2 = \frac{(dy^1)^2 + (dy^2)^2 - \dfrac{1}{a^2}\,(y^1\,dy^2 - y^2\,dy^1)^2}{\left[1 - \dfrac{1}{a^2}\left((y^1)^2 + (y^2)^2\right)\right]^2},$$

which governs the study of this geometry.

Fig. 9

Let a circle K of radius one be drawn in the plane. The universe of Lobachevskian geometry consists of points interior to K. The chords PQ of the circle are straight lines in this geometry. (See Fig. 9.) The length of the segment AB of PQ is a number given by the formula

$$\left|\,\log\left(\frac{PA}{QA} : \frac{PB}{QB}\right)\right|,$$

whereas the magnitude of the angle ABC is determined as follows. Construct a sphere S of radius one tangent to K at its center. Project AB and BC on S and determine the Euclidean angle between the arcs B′A′ and B′C′ formed by the intersection of the planes passing through BC and BA perpendicular to the plane of K (Fig. 10). The Euclidean measure of A′B′C′ is, by definition, the measure of the angle ABC in the Lobachevski plane. A pair of lines in the Lobachevski plane are considered parallel if their images on the sphere do not intersect. It can be shown that the points and lines of this geometry satisfy all postulates of Euclidean geometry except the postulate of parallels. Parallel to any given line PQ one can draw through a point M infinitely many lines which do not intersect PQ. These are the lines lying in the shaded region of Fig. 11 and passing through M. It is not difficult to prove that the sum of the angles of a triangle in this geometry is less than 180°. The consistency of Lobachevskian geometry was investigated by Cayley, Klein, and Poincaré.²

Fig. 10    Fig. 11

The discussion of this chapter is confined mainly to Euclidean geometry 
and those portions of Riemannian geometry that figure in applications. 


44. Curvilinear Coordinates in $E_3$


The apparatus of tensor analysis was developed initially as a tool 
for the analytic study of geometries of diverse sorts. Because of its 
invariantive character, it was found particularly adaptable to the needs 
of other branches of applied mathematics. Since dynamics, mechanics 
of continuous media, and relativity lean rather heavily on geometrical 
properties of the three-dimensional space of physical experience, we devote 
most of this chapter to an investigation of properties of curves and surfaces 
imbedded in $E_3$.

² For details on hyperbolic geometry we refer the reader to specialized treatises on the subject, especially to F. Klein's Nicht-Euklidische Geometrie, 1, pp. 161-232.

Let the point P(y), in a Euclidean 3-space $E_3$, be referred to a set of orthogonal cartesian axes Y (Fig. 12). Consider a general functional transformation

$$T:\quad x^i = x^i(y^1, y^2, y^3), \qquad (i = 1, 2, 3),$$

such that the $x^i$ are of class $C^1$, and $J = \left|\dfrac{\partial x^i}{\partial y^j}\right| \ne 0$ in some region R of $E_3$. The inverse transformation,

$$T^{-1}:\quad y^i = y^i(x^1, x^2, x^3), \qquad (i = 1, 2, 3),$$

will then be single-valued, and the transformations T and $T^{-1}$ establish one-to-one correspondence between the sets of values $(x^1, x^2, x^3)$ and $(y^1, y^2, y^3)$. We call the triplets of numbers $(x^1, x^2, x^3)$ the curvilinear coordinates of the points P in R. The reason for this terminology is the following: if we set $x^1$ = constant in T, then

$$x^1(y^1, y^2, y^3) = \text{constant} \tag{44.1}$$

defines a surface. If the constant is now allowed to assume different values, we get a one-parameter family of surfaces. Similarly,

$$x^2(y^1, y^2, y^3) = \text{constant}$$

and $x^3(y^1, y^2, y^3)$ = constant define two families of surfaces. The condition that the Jacobian $J \ne 0$ in the region under consideration expresses the fact that the surfaces

$$x^1 = c_1, \qquad x^2 = c_2, \qquad x^3 = c_3 \tag{44.2}$$

intersect in one and only one point.

Fig. 12

We call the surfaces defined by equations 44.2 the coordinate surfaces, and their intersections pair-by-pair are the coordinate lines. Thus the line of intersection of $x^1 = c_1$ and $x^2 = c_2$ is the $X^3$-coordinate line because along this line the variable $x^3$ is the only one that is changing. As an example, consider a coordinate system defined by the transformation

$$y^1 = x^1 \sin x^2 \cos x^3, \qquad y^2 = x^1 \sin x^2 \sin x^3, \qquad y^3 = x^1 \cos x^2.$$

The surfaces $x^1$ = constant are spheres, $x^2$ = constant are circular cones, and $x^3$ = constant are planes passing through the $Y^3$-axis (Fig. 13). The inverse transformation in this case is given by

$$x^1 = \sqrt{(y^1)^2 + (y^2)^2 + (y^3)^2}, \qquad x^2 = \tan^{-1}\frac{\sqrt{(y^1)^2 + (y^2)^2}}{y^3}, \qquad x^3 = \tan^{-1}\frac{y^2}{y^1}.$$

If $x^1 > 0$, $0 < x^2 < \pi$, $0 \le x^3 < 2\pi$, this is the familiar spherical coordinate system.

As another illustration, the transformation

$$y^1 = x^1 \cos x^2, \qquad y^2 = x^1 \sin x^2, \qquad y^3 = x^3,$$

defines a cylindrical coordinate system (Fig. 14).
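The spherical transformation and its inverse can be exercised with a short round trip. The Python sketch below is only an illustration: the function names are hypothetical, and the inverse tangents are evaluated with the two-argument `math.atan2` so that the angles fall in the ranges $0 < x^2 < \pi$ and $0 \le x^3 < 2\pi$ without a separate discussion of quadrants (a choice made here, not taken from the text).

```python
import math


def to_cartesian(x1, x2, x3):
    """y^i in terms of the spherical coordinates (x^1, x^2, x^3)."""
    return (x1 * math.sin(x2) * math.cos(x3),
            x1 * math.sin(x2) * math.sin(x3),
            x1 * math.cos(x2))


def to_spherical(y1, y2, y3):
    """Inverse transformation; atan2 selects the correct branch of the arctangent."""
    x1 = math.sqrt(y1 * y1 + y2 * y2 + y3 * y3)
    x2 = math.atan2(math.hypot(y1, y2), y3)       # polar angle
    x3 = math.atan2(y2, y1) % (2 * math.pi)       # azimuth reduced to [0, 2*pi)
    return (x1, x2, x3)


original = (2.0, 2.5, 4.0)
recovered = to_spherical(*to_cartesian(*original))
print(original, recovered)
```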


Fig. 13

Fig. 14


Let $P(y^1, y^2, y^3)$ and $Q(y^1 + dy^1, y^2 + dy^2, y^3 + dy^3)$ be two neighboring points in R. The Euclidean distance between a pair of such points is determined by the quadratic form

$$(ds)^2 = (dy^1)^2 + (dy^2)^2 + (dy^3)^2 = dy^i\,dy^i,$$

and, since $dy^i = \dfrac{\partial y^i}{\partial x^\alpha}\,dx^\alpha$, we have

$$ds^2 = g_{ij}\,dx^i\,dx^j, \tag{44.3}$$

where

$$g_{ij} = \frac{\partial y^\alpha}{\partial x^i}\,\frac{\partial y^\alpha}{\partial x^j}, \qquad (\alpha = 1, 2, 3).$$

Obviously, $g_{ij}$ is symmetric. Moreover, it is a tensor, since $(ds)^2$ is an invariant and the vector $dx^i$ is arbitrary. Denote by g the determinant $|g_{ij}|$; this is positive in R since $g_{ij}\,dx^i\,dx^j$ is a positive definite form. Hence we can introduce the conjugate symmetric tensor $g^{ij}$, defined in Sec. 30 by the formula $g^{ij} = G^{ij}/g$, where $G^{ij}$ is the cofactor of the element $g_{ij}$ in g.
Consider now a contravariant vector $A^i(x)$, and form the invariant

$$A = (g_{ij}\,A^i A^j)^{1/2}. \tag{44.4}$$

Since in the orthogonal cartesian frame the invariant 44.4 assumes the form $[(A^1)^2 + (A^2)^2 + (A^3)^2]^{1/2}$, we see that A represents the length of the vector $A^i$. Similarly, the length of the covariant vector $A_i$ is defined by the formula

$$A = (g^{ij}\,A_i A_j)^{1/2}. \tag{44.5}$$

In orthogonal cartesian coordinates $g^{ij} = \delta^{ij}$ and we have $A = (A_i A_i)^{1/2}$.




A vector whose length is 1 is called a unit vector. From formula 44.3 we see that

$$1 = g_{ij}\,\frac{dx^i}{ds}\,\frac{dx^j}{ds},$$

so that $dx^i/ds = \lambda^i$ is a unit vector. If $x^i = y^i$, so that the coordinate system is cartesian, then $dx^1/ds = \lambda^1$, $dx^2/ds = \lambda^2$, $dx^3/ds = \lambda^3$ are precisely the direction cosines of the displacement vector $(dx^1, dx^2, dx^3)$. Accordingly, we take the vector $\lambda^i$ to define the direction in space relative to a curvilinear coordinate system X (Fig. 15).

Fig. 15

Consider two directions defined by the unit vectors $\lambda^i$ and $\mu^i$ at some point P (Fig. 16). Since the manifold under consideration is Euclidean, the cosine law, following from the formula of Pythagoras, gives

$$QR^2 = PQ^2 + PR^2 - 2\,PQ \cdot PR \cos\theta,$$

and, since $\lambda^i$ and $\mu^i$ are unit vectors, $PQ = PR = 1$, and hence

$$QR^2 = 2(1 - \cos\theta). \tag{44.6}$$

The components of the vector joining R with Q are $\lambda^i - \mu^i$. Making use of the formula 44.4 for the length of a vector, we get

$$\begin{aligned}
QR^2 &= g_{ij}\,(\lambda^i - \mu^i)(\lambda^j - \mu^j) \\
&= g_{ij}\,\lambda^i\lambda^j + g_{ij}\,\mu^i\mu^j - 2\,g_{ij}\,\lambda^i\mu^j \\
&= 1 + 1 - 2\,g_{ij}\,\lambda^i\mu^j \\
&= 2(1 - g_{ij}\,\lambda^i\mu^j).
\end{aligned} \tag{44.7}$$

Fig. 16

It follows from (44.6) and (44.7) that the invariant $g_{ij}\,\lambda^i\mu^j$ is equal to $\cos\theta$, and we can write

$$\cos\theta = g_{ij}\,\lambda^i\mu^j. \tag{44.8}$$

We can use (44.8) to define the angle $\theta$ between two directions $\lambda^i$ and $\mu^i$ if we make an unambiguous definition of $\sin\theta$.

If $A^i$ and $B^i$ are any two vectors, then from the definition of the length of a vector, it is clear that

$$\cos\theta = \frac{g_{ij}\,A^i B^j}{\sqrt{g_{ij}\,A^i A^j}\,\sqrt{g_{ij}\,B^i B^j}}.$$

This leads to the formula $AB\cos\theta = g_{ij}\,A^i B^j$, defining an invariant, which is precisely the "scalar product" $\mathbf{A} \cdot \mathbf{B}$ of elementary vector analysis.

It follows from the expression

$$ds^2 = g_{ij}\,dx^i\,dx^j,$$

for the square of the element of arc ds between $P_1(x^1, x^2, x^3)$ and $P_2(x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$, that the lengths of the elements of arc measured along the coordinate lines of our curvilinear system X are

$$ds_{(1)} = \sqrt{g_{11}}\,dx^1, \qquad ds_{(2)} = \sqrt{g_{22}}\,dx^2, \qquad ds_{(3)} = \sqrt{g_{33}}\,dx^3. \tag{44.9}$$

Thus the length of the displacement vector $(dx^1, 0, 0)$ is given by $\sqrt{g_{11}}\,dx^1$, that of $(0, dx^2, 0)$ is $\sqrt{g_{22}}\,dx^2$, and the vector $(0, 0, dx^3)$ has the length $\sqrt{g_{33}}\,dx^3$ (Fig. 17).

In addition, from (44.8) we deduce that the cosines of the angles $\theta_{12}$, $\theta_{23}$, $\theta_{13}$ between the coordinate lines are given by

$$\cos\theta_{12} = \frac{g_{12}}{\sqrt{g_{11}\,g_{22}}}, \qquad \cos\theta_{23} = \frac{g_{23}}{\sqrt{g_{22}\,g_{33}}}, \qquad \cos\theta_{13} = \frac{g_{13}}{\sqrt{g_{11}\,g_{33}}}. \tag{44.10}$$


Fig. 17

For, if $\lambda^i_{(1)}:\ (dx^1/ds_{(1)}, 0, 0)$ and $\mu^i_{(2)}:\ (0, dx^2/ds_{(2)}, 0)$ are two unit vectors directed along the $X^1$- and $X^2$-coordinate lines, respectively, then

$$\cos\theta_{12} = g_{ij}\,\lambda^i_{(1)}\,\mu^j_{(2)} = g_{12}\,\frac{dx^1}{ds_{(1)}}\,\frac{dx^2}{ds_{(2)}} = \frac{g_{12}}{\sqrt{g_{11}\,g_{22}}}.$$

Since $g_{11}$, $g_{22}$, $g_{33}$ never vanish (see equation 44.9), we deduce from (44.10) a

THEOREM. A necessary and sufficient condition that a given curvilinear coordinate system X be orthogonal is that $g_{ij} = 0$, for $i \ne j$, at every point of the region R.

From the definition of the element of volume dV in curvilinear coordinates,

$$dV = \pm\left|\frac{\partial y^i}{\partial x^j}\right| dx^1\,dx^2\,dx^3,$$

where $\pm\left|\dfrac{\partial y^i}{\partial x^j}\right|$ is the absolute value of the Jacobian J of the transformation connecting the cartesian variables $y^i$ with the curvilinear $x^i$, we can readily deduce that

$$dV = dy^1\,dy^2\,dy^3 = \sqrt{g}\;dx^1\,dx^2\,dx^3. \tag{44.11}$$

For,

$$\left|\frac{\partial y^i}{\partial x^j}\right|^2 = \left|\frac{\partial y^\alpha}{\partial x^i}\right| \cdot \left|\frac{\partial y^\alpha}{\partial x^j}\right| = \left|\frac{\partial y^\alpha}{\partial x^i}\,\frac{\partial y^\alpha}{\partial x^j}\right| = |g_{ij}| = g,$$

where we made use of the definition for $g_{ij}$ (see equation 44.3), and of the rule for multiplication of determinants. The determinant g is a relative scalar of weight 2 (cf. Sec. 28) since $\sqrt{g}$ is a scalar density.
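For the spherical coordinates introduced earlier in this section, the results above can be checked numerically. The Python sketch below builds $g_{ij}$ from central-difference estimates of $\partial y^\alpha/\partial x^i$, and then inspects the off-diagonal components (which the orthogonality theorem says should vanish) and $\sqrt{g}$ (which should equal the familiar factor $(x^1)^2 \sin x^2$ of the volume element). The sample point and function names are assumptions for this illustration.

```python
import math


def cartesian(x):
    """y^a for spherical coordinates x = (x^1, x^2, x^3)."""
    r, th, ph = x
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))


def tangent_vectors(x, h=1e-6):
    """Entry i is the vector dy/dx^i, estimated by central differences."""
    vectors = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        yp, ym = cartesian(xp), cartesian(xm)
        vectors.append([(p - m) / (2 * h) for p, m in zip(yp, ym)])
    return vectors


def metric(x):
    """g_ij = (dy^a/dx^i)(dy^a/dx^j), summed on a."""
    t = tangent_vectors(x)
    return [[sum(t[i][a] * t[j][a] for a in range(3)) for j in range(3)]
            for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


point = (2.0, 0.9, 0.4)
g = metric(point)
sqrt_g = math.sqrt(det3(g))
print(g)
print(sqrt_g, point[0] ** 2 * math.sin(point[1]))
```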

From developments of this section we see that the metric properties of $E_3$, referred to a curvilinear coordinate system X, are completely determined by the tensor $g_{ij}$. Accordingly, this tensor is called the metric tensor, and the quadratic form $ds^2 = g_{ij}\,dx^i\,dx^j$ is termed the fundamental quadratic form.


45. Reciprocal Base Systems. Covariant and Contravariant Vectors 


In this section we interpret the main results of Sec. 44 in the language and notation of the elementary vector analysis introduced in Chapter 1. Let a cartesian system of axes (Fig. 18) be determined by a set of orthonormal base vectors $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$; then the position vector $\mathbf{r}$ of any point $P(y^1, y^2, y^3)$ can be represented in the form

$$\mathbf{r} = \mathbf{b}_i\,y^i, \qquad (i = 1, 2, 3). \tag{45.1}$$

Since the base vectors $\mathbf{b}_i$ are independent of the position of the point $P(y^1, y^2, y^3)$, we deduce from (45.1) that

$$d\mathbf{r} = \mathbf{b}_i\,dy^i. \tag{45.2}$$

By definition the square of the element of arc between the points $(y^1, y^2, y^3)$ and $(y^1 + dy^1, y^2 + dy^2, y^3 + dy^3)$ is given by the formula

$$ds^2 = d\mathbf{r} \cdot d\mathbf{r}. \tag{45.3}$$

The substitution from (45.2) in (45.3) gives

$$ds^2 = \mathbf{b}_i \cdot \mathbf{b}_j\,dy^i\,dy^j = \delta_{ij}\,dy^i\,dy^j = dy^i\,dy^i,$$

a familiar expression for the square of the element of arc in orthogonal cartesian coordinates.

Fig. 18
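Let a set of equations of transformation

$$x^i = x^i(y^1, y^2, y^3), \qquad (i = 1, 2, 3),$$

define a curvilinear coordinate system X. The position vector $\mathbf{r}$ can now be regarded as a function of coordinates $x^i$, and we write

$$d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial x^i}\,dx^i \tag{45.4}$$

and

$$ds^2 = d\mathbf{r} \cdot d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial x^i} \cdot \frac{\partial \mathbf{r}}{\partial x^j}\,dx^i\,dx^j = g_{ij}\,dx^i\,dx^j,$$

where

$$g_{ij} = \frac{\partial \mathbf{r}}{\partial x^i} \cdot \frac{\partial \mathbf{r}}{\partial x^j}. \tag{45.5}$$

The geometrical meaning of the vector $\partial \mathbf{r}/\partial x^i$ is simple; it is a base vector directed tangentially to the $X^i$-coordinate curve. We set

$$\mathbf{a}_i = \frac{\partial \mathbf{r}}{\partial x^i} \tag{45.6}$$

and rewrite (45.4) and (45.5) as

$$d\mathbf{r} = \mathbf{a}_i\,dx^i \tag{45.7}$$

and

$$g_{ij} = \mathbf{a}_i \cdot \mathbf{a}_j.$$

We observe that the base vectors $\mathbf{a}_i$ are no longer independent of the coordinates $(x^1, x^2, x^3)$.

The use of covariant notation for the base vectors $\mathbf{a}_i$ and $\mathbf{b}_i$ can be justified by observing from (45.2) and (45.7) that

$$\mathbf{a}_j\,dx^j = \mathbf{b}_i\,dy^i = \mathbf{b}_i\,\frac{\partial y^i}{\partial x^j}\,dx^j.$$

We see that the base vectors $\mathbf{a}_j$ transform according to the law for the transformation of components of covariant vectors,

$$\mathbf{a}_j = \frac{\partial y^i}{\partial x^j}\,\mathbf{b}_i,$$

since the $dx^j$'s are arbitrary.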


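The components of base vectors $\mathbf{a}_i$, when referred to the X-coordinate system, are

$$\mathbf{a}_1:\ (a_1, 0, 0), \qquad \mathbf{a}_2:\ (0, a_2, 0), \qquad \mathbf{a}_3:\ (0, 0, a_3),$$

and we note that they are not necessarily unit vectors, since, in general (see equation 45.5),

$$g_{11} = \mathbf{a}_1 \cdot \mathbf{a}_1 \ne 1, \qquad g_{22} = \mathbf{a}_2 \cdot \mathbf{a}_2 \ne 1, \qquad g_{33} = \mathbf{a}_3 \cdot \mathbf{a}_3 \ne 1.$$

If the curvilinear coordinate system X is orthogonal, then

$$g_{ij} = \mathbf{a}_i \cdot \mathbf{a}_j = |\mathbf{a}_i|\,|\mathbf{a}_j| \cos\theta_{ij} = 0, \qquad \text{if } i \ne j.$$

This is the result stated in the theorem of Sec. 44.

We note that any vector $\mathbf{A}$ can be written in the form $\mathbf{A} = k\,d\mathbf{r}$, where k is a suitable scalar. Since $d\mathbf{r} = (\partial \mathbf{r}/\partial x^i)\,dx^i$, we have

$$\mathbf{A} = \frac{\partial \mathbf{r}}{\partial x^i}\,(k\,dx^i) = \mathbf{a}_i\,A^i,$$

where $A^i = k\,dx^i$. The numbers $A^i$ are the contravariant components of the vector $\mathbf{A}$, and the vectors $A^1\mathbf{a}_1$, $A^2\mathbf{a}_2$, $A^3\mathbf{a}_3$ form the edges of the parallelepiped whose diagonal is $\mathbf{A}$. Since the $\mathbf{a}_i$ are not unit vectors in general, we see that the lengths of edges of this parallelepiped, or the physical components of $\mathbf{A}$, are determined by the formulas

$$A^1\sqrt{g_{11}}, \qquad A^2\sqrt{g_{22}}, \qquad A^3\sqrt{g_{33}},$$

since $g_{11} = \mathbf{a}_1 \cdot \mathbf{a}_1$, $g_{22} = \mathbf{a}_2 \cdot \mathbf{a}_2$, $g_{33} = \mathbf{a}_3 \cdot \mathbf{a}_3$.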
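Let us introduce next three noncoplanar vectors

$$\mathbf{a}^1 = \frac{\mathbf{a}_2 \times \mathbf{a}_3}{[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]}, \qquad \mathbf{a}^2 = \frac{\mathbf{a}_3 \times \mathbf{a}_1}{[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]}, \qquad \mathbf{a}^3 = \frac{\mathbf{a}_1 \times \mathbf{a}_2}{[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]}, \tag{45.8}$$

where $\mathbf{a}_2 \times \mathbf{a}_3$, etc., denote the vector product³ of $\mathbf{a}_2$ and $\mathbf{a}_3$, and $[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]$ is the triple scalar product $\mathbf{a}_1 \cdot \mathbf{a}_2 \times \mathbf{a}_3$.

It is obvious from the definitions 45.8 that $\mathbf{a}^i \cdot \mathbf{a}_j = \delta^i_j$, and it is easily verified that $[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3] = \sqrt{g}$, where $g = |g_{ij}|$, and that the triple scalar products $[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3]$ and $[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]$ are reciprocally related, so that $[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3] = 1/\sqrt{g}$. Moreover,

$$\mathbf{a}_1 = \frac{\mathbf{a}^2 \times \mathbf{a}^3}{[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3]}, \qquad \mathbf{a}_2 = \frac{\mathbf{a}^3 \times \mathbf{a}^1}{[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3]}, \qquad \mathbf{a}_3 = \frac{\mathbf{a}^1 \times \mathbf{a}^2}{[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3]}, \tag{45.9}$$

as can be readily checked with the aid of (45.8). In view of this it is natural to call the system of vectors $\mathbf{a}^1$, $\mathbf{a}^2$, $\mathbf{a}^3$ the reciprocal base system.

³ We recall that $\mathbf{a}_1 \times \mathbf{a}_2$ is a vector of length $a_1 a_2\,|\sin(\mathbf{a}_1, \mathbf{a}_2)|$, and so oriented that $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_1 \times \mathbf{a}_2$ form a right-handed system. The triple scalar product $[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3]$, on the other hand, is numerically equal to the volume of the parallelepiped constructed on the vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$. If $\{\mathbf{a}_i\}$ is a set of base vectors in $E_n$, the reciprocal basis $\{\mathbf{a}^i\}$ is determined by $\mathbf{a}_i \cdot \mathbf{a}^j = \delta_i^j$.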
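The construction (45.8) is easy to try out on a concrete oblique basis. The Python sketch below, given only as an illustration, forms the reciprocal vectors from vector products and then checks the relations quoted in the text: $\mathbf{a}^i \cdot \mathbf{a}_j = \delta^i_j$, $[\mathbf{a}_1\mathbf{a}_2\mathbf{a}_3] = \sqrt{g}$, $[\mathbf{a}^1\mathbf{a}^2\mathbf{a}^3] = 1/\sqrt{g}$, and the recovery of the original basis through (45.9). The particular right-handed basis `a` and the helper names are assumptions.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Triple scalar product [u v w] = u . (v x w)."""
    return dot(u, cross(v, w))


def reciprocal(a1, a2, a3):
    """Reciprocal base vectors according to (45.8)."""
    vol = triple(a1, a2, a3)
    pairs = ((a2, a3), (a3, a1), (a1, a2))
    return tuple(tuple(c / vol for c in cross(u, v)) for u, v in pairs)


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# An oblique, right-handed basis (cartesian components), chosen for illustration.
a = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.5, 0.5, 3.0))
b = reciprocal(*a)                       # b[i] plays the role of a^i
g = [[dot(a[i], a[j]) for j in range(3)] for i in range(3)]
back = reciprocal(*b)                    # (45.9): should reproduce a

print(b)
print(triple(*a), math.sqrt(det3(g)), triple(*b))
```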

We observe that if the vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ are unit vectors associated with an orthogonal cartesian system of coordinates, then the reciprocal system of vectors defines the same system of coordinates.

Using the reciprocal base system, we can write the differential of a vector $\mathbf{r}$ in the form $d\mathbf{r} = \mathbf{a}^i\,dx_i$, where the $dx_i$ are the appropriate components of $d\mathbf{r}$. Then

$$ds^2 = d\mathbf{r} \cdot d\mathbf{r} = (\mathbf{a}^i\,dx_i) \cdot (\mathbf{a}^j\,dx_j) = \mathbf{a}^i \cdot \mathbf{a}^j\,dx_i\,dx_j = g^{ij}\,dx_i\,dx_j,$$

where

$$g^{ij} = \mathbf{a}^i \cdot \mathbf{a}^j = g^{ji}. \tag{45.10}$$

It is not difficult to check that the coefficients $g^{ij}$, defined by the formula 45.10, coincide with the quantities $g^{ij}$ defined earlier. Thus, making use of formulas 45.8 and 45.9, we can readily show that $g_{ij}\,g^{jk} = \delta_i^k$, and the solution of this system of equations for the $g^{jk}$ gives $g^{jk} = G^{jk}/g$, where $G^{jk}$ is the cofactor of the element $g_{jk}$ in the determinant $|g_{ij}|$. Thus the definition of $g^{ij}$ given in Sec. 44 follows as a theorem from the definition 45.10.

The system of base vectors determined by (45.8) can be used to represent an arbitrary vector $\mathbf{A}$ in the form $\mathbf{A} = \mathbf{a}^i A_i$, where the $A_i$ are the covariant components of $\mathbf{A}$. If we form the scalar product of the vector $A_i\,\mathbf{a}^i$ with the base vector $\mathbf{a}_j$, and note that the latter is directed along the $X^j$-coordinate line, we get $A_i\,\mathbf{a}^i \cdot \mathbf{a}_j = A_i\,\delta^i_j = A_j$. Thus $A_j/\sqrt{g_{jj}}$ (no sum on j) is the length of the orthogonal projection of the vector $\mathbf{A}$ on the tangent to the $X^j$-coordinate curve at the point P (Fig. 19), whereas $A^j\sqrt{g_{jj}}$ is the length of the edge of the parallelepiped whose diagonal is the vector $\mathbf{A}$.

Since

$$\mathbf{A} = \mathbf{a}_i A^i = \mathbf{a}^i A_i,$$

we have

$$\mathbf{a}_j \cdot \mathbf{a}_i\,A^i = \mathbf{a}^i \cdot \mathbf{a}_j\,A_i,$$

or

$$g_{ij}\,A^i = \delta^i_j\,A_i = A_j.$$

We see that the vector obtained by lowering the index in $A^i$ is precisely the covariant vector $A_i$. The two sets of quantities $A^i$ and $A_i$ are thus seen to represent the same vector $\mathbf{A}$ referred to two different base systems. As has already been noted, the distinction between the covariant and contravariant components of $\mathbf{A}$ disappears whenever the base vectors $\mathbf{a}_i$ are orthonormal.

Fig. 19

Similarly, if we consider the coefficients $A_{ijk}$ in a multilinear form $A_{ijk}\,\mathbf{a}^i\mathbf{a}^j\mathbf{a}^k$ and require that $A_{ijk}\,\mathbf{a}^i\mathbf{a}^j\mathbf{a}^k = A^{ijk}\,\mathbf{a}_i\mathbf{a}_j\mathbf{a}_k$, then the set of quantities $\{A^{ijk}\}$ represents the same tensor A when referred to the basis $\{\mathbf{a}_i\}$. All associated tensors (see Sec. 30) represent the same tensor A in suitable base systems.
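A small planar example may help to fix the distinction between the two kinds of components. The Python sketch below works in two dimensions for brevity (an assumption made here; the text treats $E_3$), with an oblique basis $\mathbf{a}_1 = (1, 0)$, $\mathbf{a}_2 = (1, 1)$ and a vector with cartesian components (3, 2). It computes the covariant components $A_j = \mathbf{A} \cdot \mathbf{a}_j$, raises the index with $g^{ij}$ to obtain $A^i$, and then checks that lowering with $g_{ij}$ returns $A_j$ and that $A^i A_i$ is the squared length. All names and numerical values are illustrative.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


a = [(1.0, 0.0), (1.0, 1.0)]     # oblique base vectors a_1, a_2 (cartesian components)
A = (3.0, 2.0)                   # cartesian components of the vector A

# Metric tensor g_ij = a_i . a_j and its inverse g^ij (2 x 2 case).
g = [[dot(a[i], a[j]) for j in range(2)] for i in range(2)]
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det_g, -g[0][1] / det_g],
         [-g[1][0] / det_g, g[0][0] / det_g]]

# Covariant components A_j = A . a_j, contravariant components A^i = g^ij A_j.
A_cov = [dot(A, a[j]) for j in range(2)]
A_con = [sum(g_inv[i][j] * A_cov[j] for j in range(2)) for i in range(2)]

# Lowering the index of A^i must give back A_j.
lowered = [sum(g[i][j] * A_con[i] for i in range(2)) for j in range(2)]

# Edge lengths of the parallelogram and orthogonal projections on the base vectors.
edges = [A_con[j] * math.sqrt(g[j][j]) for j in range(2)]
projections = [A_cov[j] / math.sqrt(g[j][j]) for j in range(2)]

print(A_con, A_cov, edges, projections)
```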


46. On the Meaning of Covariant Derivatives 


Let $\mathbf{A}$ be a vector localized at some point $P(y^1, y^2, y^3)$ of $E_3$ referred to an orthogonal cartesian frame Y. If at every point of some region R about P we have a uniquely defined vector $\mathbf{A}$, we refer to the totality of vectors $\mathbf{A}$ in R as a vector field. We suppose that the components of $\mathbf{A}$ are continuously differentiable functions of $y^i$ in R, and, if we introduce a curvilinear system of coordinates X by means of the transformation

$$T:\quad x^i = x^i(y^1, y^2, y^3),$$

the corresponding components $A^i(x)$ will be continuously differentiable functions of the point $(x^1, x^2, x^3)$ determined by the position vector $\mathbf{r}(x^1, x^2, x^3)$. In the notation of Sec. 45, the base vectors in the X-reference frame are $\mathbf{a}_i = \partial \mathbf{r}/\partial x^i$, so that $\mathbf{A}$ has the representation

$$\mathbf{A} = A^i\,\mathbf{a}_i. \tag{46.1}$$

We will be concerned with the calculation of the vector change $\Delta\mathbf{A}$ in $\mathbf{A}$ as the point $P(x^1, x^2, x^3)$ assumes a different position

$$P'(x^1 + \Delta x^1,\ x^2 + \Delta x^2,\ x^3 + \Delta x^3).$$


From (46.1) we have

$$\Delta\mathbf{A} = (A^i + \Delta A^i)(\mathbf{a}_i + \Delta\mathbf{a}_i) - A^i\,\mathbf{a}_i = \Delta A^i\,\mathbf{a}_i + A^i\,\Delta\mathbf{a}_i + (\Delta A^i)(\Delta\mathbf{a}_i).$$

As in ordinary calculus we denote the principal part of the change by $d\mathbf{A}$, and write

$$d\mathbf{A} = \mathbf{a}_i\,dA^i + A^i\,d\mathbf{a}_i. \tag{46.2}$$

This formula states that the differential change in $\mathbf{A}$ arises from two sources:

(a) Change in the components $A^i$ as the values $(x^1, x^2, x^3)$ are changed.
(b) Change in the base vectors $\mathbf{a}_i$ as the position of the point $(x^1, x^2, x^3)$ is altered.


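The partial derivative of $\mathbf{A}$ with respect to $x^j$ is defined as the limit of the quotient,

$$\lim_{\Delta x^j \to 0} \frac{\Delta\mathbf{A}}{\Delta x^j} = \frac{\partial \mathbf{A}}{\partial x^j},$$

and it follows from the expression for the increment $\Delta\mathbf{A}$ that

$$\frac{\partial \mathbf{A}}{\partial x^j} = \frac{\partial A^i}{\partial x^j}\,\mathbf{a}_i + A^i\,\frac{\partial \mathbf{a}_i}{\partial x^j}. \tag{46.3}$$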


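We show next that the vector defined by formula 46.3 is identical with the covariant derivative of the vector $A^i$. First we establish the identity

$$\frac{\partial \mathbf{a}_i}{\partial x^j} = \begin{Bmatrix} k \\ i\,j \end{Bmatrix} \mathbf{a}_k. \tag{46.4}$$

We recall that $g_{ij} = \mathbf{a}_i \cdot \mathbf{a}_j$. Hence

$$\frac{\partial g_{ij}}{\partial x^k} = \frac{\partial \mathbf{a}_i}{\partial x^k} \cdot \mathbf{a}_j + \frac{\partial \mathbf{a}_j}{\partial x^k} \cdot \mathbf{a}_i.$$

Permuting the indices in this formula, we get

$$\frac{\partial g_{jk}}{\partial x^i} = \frac{\partial \mathbf{a}_j}{\partial x^i} \cdot \mathbf{a}_k + \frac{\partial \mathbf{a}_k}{\partial x^i} \cdot \mathbf{a}_j, \qquad \frac{\partial g_{ik}}{\partial x^j} = \frac{\partial \mathbf{a}_i}{\partial x^j} \cdot \mathbf{a}_k + \frac{\partial \mathbf{a}_k}{\partial x^j} \cdot \mathbf{a}_i.$$

If we assume that T is of class $C^2$, then⁴

$$\frac{\partial \mathbf{a}_i}{\partial x^j} = \frac{\partial \mathbf{a}_j}{\partial x^i}.$$

We form

$$\frac{1}{2}\left(\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right)$$

and obtain

$$\frac{\partial \mathbf{a}_i}{\partial x^j} \cdot \mathbf{a}_k = [ij, k]. \tag{46.5}$$

It follows from (46.5) that

$$\frac{\partial \mathbf{a}_i}{\partial x^j} = [ij, k]\,\mathbf{a}^k.$$

Hence

$$\frac{\partial \mathbf{a}_i}{\partial x^j} \cdot \mathbf{a}^\alpha = [ij, k]\,\mathbf{a}^k \cdot \mathbf{a}^\alpha = [ij, k]\,g^{k\alpha} = \begin{Bmatrix} \alpha \\ i\,j \end{Bmatrix},$$

from which it follows that

$$\frac{\partial \mathbf{a}_i}{\partial x^j} = \begin{Bmatrix} \alpha \\ i\,j \end{Bmatrix} \mathbf{a}_\alpha.$$

This establishes the identity 46.4.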
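Inserting this result in (46.3), we get

$\dfrac{\partial\mathbf{A}}{\partial x^j} = \left(\dfrac{\partial A^\alpha}{\partial x^j} + \left\{ {\alpha \atop ij} \right\}A^i\right)\mathbf{a}_\alpha,$

and the expression in the bracket is precisely $A^\alpha_{,j}$. Thus

(46.6)  $\dfrac{\partial\mathbf{A}}{\partial x^j} = A^\alpha_{,j}\,\mathbf{a}_\alpha.$

It follows from (46.6) that the covariant derivative $A^\alpha_{,j}$ of the vector $A^\alpha$ is a vector whose components are precisely the components of $\partial\mathbf{A}/\partial x^j$ referred to the base system $\mathbf{a}_i$.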
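The identity (46.6) can be checked symbolically in a concrete coordinate system. The following is a minimal sketch, assuming SymPy is available and choosing cylindrical coordinates purely for illustration; the names `christoffel`, `cov_deriv`, and `residuals` are introduced here and are not part of the text.

```python
import sympy as sp

# Cylindrical coordinates x^1 = r, x^2 = theta, x^3 = z (illustrative choice).
r, th, z = sp.symbols("r theta z", positive=True)
x = [r, th, z]

# Position vector in cartesian components and base vectors a_i = dr/dx^i.
R = sp.Matrix([r * sp.cos(th), r * sp.sin(th), z])
a = [R.diff(xi) for xi in x]

# Metric g_ij = a_i . a_j and its inverse.
g = sp.Matrix([[sp.simplify(a[i].dot(a[j])) for j in range(3)] for i in range(3)])
g_inv = g.inv()

def christoffel(k, i, j):
    """Christoffel symbol of the second kind {k; ij} built from g_ij."""
    return sp.simplify(sum(
        g_inv[k, m] * (g[m, i].diff(x[j]) + g[m, j].diff(x[i]) - g[i, j].diff(x[m]))
        for m in range(3)) / 2)

# Arbitrary differentiable contravariant components A^i(x).
A = [sp.Function(f"A{i + 1}")(*x) for i in range(3)]
A_vec = sum((A[i] * a[i] for i in range(3)), sp.zeros(3, 1))

def cov_deriv(alpha, j):
    """Covariant derivative A^alpha_{,j} = dA^alpha/dx^j + {alpha; ij} A^i."""
    return A[alpha].diff(x[j]) + sum(christoffel(alpha, i, j) * A[i] for i in range(3))

# Residual of (46.6): dA/dx^j - A^alpha_{,j} a_alpha, one column vector per j.
residuals = [
    (A_vec.diff(x[j])
     - sum((cov_deriv(al, j) * a[al] for al in range(3)), sp.zeros(3, 1))
     ).applyfunc(sp.simplify)
    for j in range(3)
]
print(residuals)
```

Each residual is a column of cartesian components; all of them reducing to zero is what (46.6) asserts for this coordinate system.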


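⁴ For, $\mathbf{a}_i = \dfrac{\partial\mathbf{r}}{\partial x^i}$, and $\dfrac{\partial\mathbf{a}_i}{\partial x^j} = \dfrac{\partial^2\mathbf{r}}{\partial x^j\,\partial x^i} = \dfrac{\partial^2\mathbf{r}}{\partial x^i\,\partial x^j} = \dfrac{\partial\mathbf{a}_j}{\partial x^i}$.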


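We can also show that if $\mathbf{A}$ is represented in the form

(46.7)  $\mathbf{A} = A_\alpha\mathbf{a}^\alpha,$

then

(46.8)  $\dfrac{\partial\mathbf{A}}{\partial x^k} = A_{\alpha,k}\,\mathbf{a}^\alpha.$

From $\mathbf{a}^i\cdot\mathbf{a}_j = \delta^i_j$ we have

$\dfrac{\partial\mathbf{a}^i}{\partial x^k}\cdot\mathbf{a}_j + \mathbf{a}^i\cdot\dfrac{\partial\mathbf{a}_j}{\partial x^k} = 0.$

Therefore

$\dfrac{\partial\mathbf{a}^i}{\partial x^k}\cdot\mathbf{a}_j = -\mathbf{a}^i\cdot\dfrac{\partial\mathbf{a}_j}{\partial x^k} = -\left\{ {i \atop jk} \right\}$

by (46.4). Since $\mathbf{a}^\alpha\cdot\mathbf{a}_j = \delta^\alpha_j$, the foregoing result is equivalent to

$\dfrac{\partial\mathbf{a}^i}{\partial x^k}\cdot\mathbf{a}_j = -\left\{ {i \atop \alpha k} \right\}\mathbf{a}^\alpha\cdot\mathbf{a}_j.$

Hence

(46.9)  $\dfrac{\partial\mathbf{a}^i}{\partial x^k} = -\left\{ {i \atop jk} \right\}\mathbf{a}^j.$

The differentiation of (46.7) with respect to $x^k$ and the substitution from (46.9) lead at once to (46.8).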
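Formula (46.9) for the reciprocal base vectors can be checked the same way. The sketch below assumes SymPy and uses spherical coordinates as an illustrative choice; the reciprocal base is built from $\mathbf{a}^i = g^{ij}\mathbf{a}_j$, and the helper names are hypothetical.

```python
import sympy as sp

# Spherical coordinates x^1 = rho, x^2 = theta, x^3 = phi (illustrative choice).
rho, th, ph = sp.symbols("rho theta phi", positive=True)
x = [rho, th, ph]

R = sp.Matrix([rho * sp.sin(th) * sp.cos(ph),
               rho * sp.sin(th) * sp.sin(ph),
               rho * sp.cos(th)])
a_lo = [R.diff(xi) for xi in x]                      # base vectors a_i

g = sp.Matrix([[sp.simplify(a_lo[i].dot(a_lo[j])) for j in range(3)] for i in range(3)])
g_inv = g.inv()

# Reciprocal base vectors a^i = g^{ij} a_j.
a_up = [sum((g_inv[i, j] * a_lo[j] for j in range(3)), sp.zeros(3, 1)) for i in range(3)]

def christoffel(i, j, k):
    """Christoffel symbol of the second kind {i; jk}."""
    return sp.simplify(sum(
        g_inv[i, m] * (g[m, j].diff(x[k]) + g[m, k].diff(x[j]) - g[j, k].diff(x[m]))
        for m in range(3)) / 2)

# Residual of (46.9): da^i/dx^k + {i; jk} a^j for every pair (i, k).
residuals = [
    (a_up[i].diff(x[k])
     + sum((christoffel(i, j, k) * a_up[j] for j in range(3)), sp.zeros(3, 1))
     ).applyfunc(sp.simplify)
    for i in range(3) for k in range(3)
]
print(all(res == sp.zeros(3, 1) for res in residuals))
```

Note the sign: the base vectors $\mathbf{a}_i$ obey (46.4) with a plus sign, while the reciprocal vectors $\mathbf{a}^i$ obey (46.9) with a minus sign, which is what turns the contravariant rule into the covariant one in (46.8).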
We observe that, if the Christoffel symbols vanish identically in $R$, the reference frame associated with these symbols is cartesian (see Theorem I, Sec. 39), and, in this case, the base vectors $\mathbf{a}_i$ are independent of the coordinates $x^i$. The formula 46.3 then states that

$\dfrac{\partial\mathbf{A}}{\partial x^j} = \dfrac{\partial A^i}{\partial x^j}\,\mathbf{a}_i,$

and hence $A^i_{,j} = \dfrac{\partial A^i}{\partial x^j}$.


47. Intrinsic Differentiation 
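Let a vector field $\mathbf{A}(x)$ be defined in some region of $E_3$, and let

$C:\quad x^i = x^i(t), \qquad t_1 \le t \le t_2,$

be a curve in that region. The vectors $\mathbf{A}(x)$, defined over the one-dimensional manifold $C$, depend on the parameter $t$, and if $\mathbf{A}(x)$ is a differentiable vector and the $x^i(t)$ belong to the class $C^1$, then

$\dfrac{d\mathbf{A}}{dt} = \dfrac{\partial\mathbf{A}}{\partial x^j}\,\dfrac{dx^j}{dt}.$

By virtue of (46.6) this can be written

$\dfrac{d\mathbf{A}}{dt} = A^\alpha_{,j}\,\dfrac{dx^j}{dt}\,\mathbf{a}_\alpha = \left(\dfrac{dA^\alpha}{dt} + \left\{ {\alpha \atop ij} \right\}A^i\,\dfrac{dx^j}{dt}\right)\mathbf{a}_\alpha.$

The vector $\delta A^\alpha/\delta t$, defined by the formula

$\dfrac{\delta A^\alpha}{\delta t} = \dfrac{dA^\alpha}{dt} + \left\{ {\alpha \atop ij} \right\}A^i\,\dfrac{dx^j}{dt},$

is called the absolute or intrinsic derivative of $A^\alpha$ with respect to the parameter $t$.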

Following McConnell⁵ we will make free use of intrinsic differentiation in the treatment of geometry of curves and surfaces.

If the vector field $A^\alpha$ is defined in the neighborhood of $C$, as well as on $C$, we can write

(47.1)  $\dfrac{\delta A^\alpha}{\delta t} = A^\alpha_{,j}\,\dfrac{dx^j}{dt}, \qquad (\alpha = 1, 2, 3),$

and it follows that the familiar rules for differentiation of sums, products, etc., remain valid for the process of intrinsic differentiation. If $A$ is a scalar, then, obviously, $\delta A/\delta t = dA/dt$.

The extension of the process of intrinsic differentiation to tensors of rank greater than one is immediate. Thus we write

$\dfrac{\delta A^{ij}_k}{\delta t} = \dfrac{dA^{ij}_k}{dt} - \left\{ {\alpha \atop k\beta} \right\}A^{ij}_\alpha\,\dfrac{dx^\beta}{dt} + \left\{ {i \atop \alpha\beta} \right\}A^{\alpha j}_k\,\dfrac{dx^\beta}{dt} + \left\{ {j \atop \alpha\beta} \right\}A^{i\alpha}_k\,\dfrac{dx^\beta}{dt}.$

We observe that, since $\dfrac{\delta g_{ij}}{\delta t} = 0$, the fundamental tensors $g_{ij}$ and $g^{ij}$ can be taken outside the sign of intrinsic differentiation.


Problems 

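1. Prove that $\dfrac{d}{dt}\left(g_{ij}A^iA^j\right) = 2g_{ij}A^i\,\dfrac{\delta A^j}{\delta t}$.

2. Show that $A_{i,j} - A_{j,i} = \dfrac{\partial A_i}{\partial x^j} - \dfrac{\partial A_j}{\partial x^i}$.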


ôA’ 
3. Show that — ETTA iBi) Ea ar et tgus -r 
4. If $A_i = g_{i\alpha}A^\alpha$, show that $A_{i,\beta} = g_{i\alpha}A^\alpha_{,\beta}$.


5 Compare A. J. McConnell, Absolute Differential Calculus, pp. 156-162. 


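5. Show that $\dfrac{\partial}{\partial x^j}\left(g_{\alpha\beta}A^\alpha B^\beta\right) = A_{\alpha,j}B^\alpha + A^\alpha B_{\alpha,j}$.

6. Prove that if $A$ is the magnitude of $A^i$, then $A_{,j} = A_{i,j}A^i/A$.

7. If $y^i$ are rectangular cartesian coordinates, show that in $E_3$

$[\alpha\beta, \gamma] = \dfrac{\partial^2 y^i}{\partial x^\alpha\,\partial x^\beta}\,\dfrac{\partial y^i}{\partial x^\gamma} \qquad\text{and}\qquad \left\{ {\gamma \atop \alpha\beta} \right\} = \dfrac{\partial^2 y^i}{\partial x^\alpha\,\partial x^\beta}\,\dfrac{\partial x^\gamma}{\partial y^i}.$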

These formulas are often found to be more convenient for the computation of 
Christoffel’s symbols than the defining formulas 31.1 and 31.2. 


48. Parallel Vector Fields 


Consider a curve (Fig. 20),

$C:\quad x^i = x^i(t), \qquad t_1 \le t \le t_2, \qquad (i = 1, 2, 3),$

drawn in some region of $E_3$, and a vector $\mathbf{A}$ localized at some point $P$ of $C$.
We suppose that the functions $x^i(t)$ are of class $C^1$. If we construct at
every point of C a vector equal to A in magnitude and parallel to it in 
direction, we obtain what is known as a parallel field of vectors along the 
curve C. We will deduce a set of necessary and sufficient conditions for 
a vector field to be parallel. 

If $\mathbf{A}$ is a parallel field along $C$, then the vectors $\mathbf{A}$ do not change along the curve and we can write $d\mathbf{A}/dt = 0$. It follows, upon noting (47.1), that the components $A^i$ of $\mathbf{A}$ satisfy a set of simultaneous differential equations $\delta A^i/\delta t = 0$, or, when written out in full,

(48.1)  $\dfrac{dA^i}{dt} + \left\{ {i \atop \alpha\beta} \right\}A^\alpha\,\dfrac{dx^\beta}{dt} = 0.$

Fig. 20


We can show, conversely, that every solution of the system 48.1 yields 
a parallel vector field along C. Indeed, from the theory of differential 
equations it is known that this system of three first-order differential 
equations has a unique solution when the values of the components $A^i$
are specified at a given point of C. But it was shown previously that the 
vector field formed by constructing a family of vectors of fixed lengths, 
parallel to a given vector, satisfies the system. Hence every solution of 
equation 48.1 satisfying the initial conditions must form a parallel field 
along C. 

Let $A^i(t)$ and $B^i(t)$ be any two solutions of the system 48.1. We verify that the lengths of vectors $A^i$ and $B^i$ indeed do not change as we move along the curve. Moreover, the angle $\theta$ between the vectors $A^i$ and $B^i$ remains fixed as the parameter $t$ is allowed to change. To prove this we note that (Sec. 44) $\mathbf{A}\cdot\mathbf{B} = AB\cos\theta = g_{ij}A^iB^j$, and, if $g_{ij}A^iB^j$ is to remain constant along $C$, then $\dfrac{d}{dt}\left(g_{ij}A^iB^j\right) = 0$. But $g_{ij}A^iB^j$ is an invariant, and, since the $g_{ij}$ behave like constants in the process of covariant differentiation, we can write

$\dfrac{d}{dt}\left(g_{ij}A^iB^j\right) = \dfrac{\delta}{\delta t}\left(g_{ij}A^iB^j\right) = g_{ij}\,\dfrac{\delta A^i}{\delta t}\,B^j + g_{ij}A^i\,\dfrac{\delta B^j}{\delta t}.$

Since, by hypothesis, the fields $A^i$ and $B^i$ satisfy (48.1), $\delta A^i/\delta t = 0$ and $\delta B^i/\delta t = 0$, and we conclude that $g_{ij}A^iB^j$ is constant along $C$. It follows directly from this result that, if $A^i = B^i$, then $g_{ij}A^iA^j = A^2$ is constant along $C$, and this implies that $\theta$ = constant.
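The system (48.1) can be integrated numerically to see both properties at work. The following is a rough sketch, assuming NumPy, restricted to the plane $x^3 = 0$ in polar coordinates (nonvanishing symbols $\{1; 22\} = -r$, $\{2; 12\} = \{2; 21\} = 1/r$); the curve, the initial vectors, and the hand-written Runge-Kutta integrator are illustrative choices, not taken from the text.

```python
import numpy as np

def curve(t):
    """Illustrative curve in polar coordinates: (r, theta, dr/dt, dtheta/dt)."""
    return 1.0 + 0.5 * t, t, 0.5, 1.0

def rhs(t, A):
    """System (48.1) solved for dA^i/dt, using the polar Christoffel symbols."""
    r, _, dr, dth = curve(t)
    Ar, Ath = A
    return np.array([r * Ath * dth, -(Ar * dth + Ath * dr) / r])

def transport(A0, t0=0.0, t1=2.0, n=2000):
    """Integrate (48.1) from t0 to t1 with the classical fourth-order Runge-Kutta scheme."""
    A, h = np.array(A0, dtype=float), (t1 - t0) / n
    for k in range(n):
        t = t0 + k * h
        k1 = rhs(t, A)
        k2 = rhs(t + h / 2, A + h / 2 * k1)
        k3 = rhs(t + h / 2, A + h / 2 * k2)
        k4 = rhs(t + h, A + h * k3)
        A = A + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return A

def inner(t, A, B):
    """Invariant g_ij A^i B^j with g_11 = 1, g_22 = r^2."""
    r = curve(t)[0]
    return A[0] * B[0] + r**2 * A[1] * B[1]

def to_cartesian(t, A):
    """Cartesian components of A^i a_i at the point of the curve with parameter t."""
    r, th, _, _ = curve(t)
    return np.array([A[0] * np.cos(th) - r * A[1] * np.sin(th),
                     A[0] * np.sin(th) + r * A[1] * np.cos(th)])

A0, B0 = [1.0, 0.3], [-0.4, 0.8]
A1, B1 = transport(A0), transport(B0)
print(inner(0.0, A0, B0), inner(2.0, A1, B1))
print(to_cartesian(0.0, A0), to_cartesian(2.0, A1))
```

The curvilinear components $A^i$ change along the curve, but the invariant $g_{ij}A^iB^j$ and the cartesian components of the transported vector agree at both ends up to integration error, as the argument above predicts.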

The notion of a parallel vector field along a curve can be extended to 
define parallel vector fields over three-dimensional Euclidean manifolds. 
Thus consider any point P(x) and a vector A localized at P. If we construct 
at every point of the manifold a vector equal to A in magnitude and 
parallel to it in direction, there will result a parallel vector field in the 
space of three dimensions. If a curve C is drawn passing through P, 
the vectors $A^i$ of the field lying on $C$ will form a parallel field along $C$, and will thus satisfy (48.1). However, since vectors $A^i$ are defined at every point $(x^i)$ of the manifold, we can write

$\dfrac{dA^i}{dt} = \dfrac{\partial A^i}{\partial x^\beta}\,\dfrac{dx^\beta}{dt},$

so that equations 48.1 assume the form

$\left(\dfrac{\partial A^i}{\partial x^\beta} + \left\{ {i \atop \alpha\beta} \right\}A^\alpha\right)\dfrac{dx^\beta}{dt} = 0.$

This must be true for all curves passing through $P$, that is, for all values of $dx^\beta/dt$. Accordingly, the parallel vector field in $E_3$ satisfies the system of equations

$\dfrac{\partial A^i}{\partial x^\beta} + \left\{ {i \atop \alpha\beta} \right\}A^\alpha = 0, \qquad\text{or}\qquad A^i_{,\beta} = 0.$

The converse follows, as previously, from the existence and uniqueness of solutions of such systems of differential equations.

The condition for a parallel displacement of a covariant vector $A_i$ is

$\dfrac{\partial A_i}{\partial x^\beta} - \left\{ {\alpha \atop i\beta} \right\}A_\alpha = 0, \qquad\text{or}\qquad A_{i,\beta} = 0.$

This follows from the observation that $A_{i,\beta} = g_{i\alpha}A^\alpha_{,\beta}$ whenever $A_i = g_{i\alpha}A^\alpha$.


49. Geometry of Space Curves 
Let the parametric equations of the curve $C$ in $E_3$ be

$C:\quad x^i = x^i(t), \qquad t_1 \le t \le t_2, \qquad (i = 1, 2, 3).$

The square of the length of an element of $C$ is given by

(49.1)  $ds^2 = g_{ij}\,dx^i\,dx^j,$

and the length of arc $s$ of $C$ is defined by the integral

(49.2)  $s = \displaystyle\int_{t_1}^{t_2}\sqrt{g_{ij}\,\dfrac{dx^i}{dt}\,\dfrac{dx^j}{dt}}\;dt.$

From (49.1) we see that

(49.3)  $1 = g_{ij}\,\dfrac{dx^i}{ds}\,\dfrac{dx^j}{ds},$

and, if we set $dx^i/ds = \lambda^i$, equation 49.3 can be written as

(49.4)  $g_{ij}\lambda^i\lambda^j = 1.$

Thus the vector $\boldsymbol{\lambda}$, with components $\lambda^i$, is a unit vector. Moreover, $\boldsymbol{\lambda}$ is tangent to $C$, since its components $\lambda^i$, when the curve $C$ is referred to a rectangular cartesian reference frame $Y$, become $\lambda^i = dy^i/ds$. These are precisely the direction cosines of the tangent vector to the curve $C$. We shall assume throughout this discussion that the curve $C$ is of class $C^2$, so that it has a continuously turning tangent at all points of $C$.

Consider a pair of unit vectors $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ (with components $\lambda^i$ and $\mu^i$, respectively) at any point $P$ of $C$ (Fig. 21). We suppose that $\boldsymbol{\lambda}$ is tangent to $C$ at $P$. The cosine of the angle $\theta$ between $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ is given by the formula

(49.5)  $\cos\theta = g_{ij}\lambda^i\mu^j,$

and, if $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ are orthogonal, (49.5) requires that

(49.6)  $g_{ij}\lambda^i\mu^j = 0.$

Fig. 21


Any vector $\boldsymbol{\mu}$ satisfying equation 49.6 is said to be normal to $C$ at $P$.

If we take the intrinsic derivative with respect to the arc parameter $s$ of the quadratic relation 49.4, and recall that the $g_{ij}$'s behave in covariant differentiation like constants, we obtain

$g_{ij}\,\dfrac{\delta\lambda^i}{\delta s}\,\lambda^j + g_{ij}\lambda^i\,\dfrac{\delta\lambda^j}{\delta s} = 0.$

Since $g_{ij}$ is symmetric, the foregoing result can be written in the form $g_{ij}\lambda^i\,\dfrac{\delta\lambda^j}{\delta s} = 0$. We see that the vector $\dfrac{\delta\lambda^j}{\delta s}$ either vanishes or is normal to $C$, and if it does not vanish we denote the unit vector codirectional with $\dfrac{\delta\lambda^j}{\delta s}$ by $\mu^j$, and write

(49.7)  $\mu^j = \dfrac{1}{\kappa}\,\dfrac{\delta\lambda^j}{\delta s},$

where $\kappa > 0$ is so chosen as to make $\mu^j$ a unit vector.

The vector $\mu^j$, determined by the formula 49.7, is called the principal normal vector to the curve $C$ at the point $P$, and $\kappa$ is the curvature of $C$ at the point in question.

The plane determined by the tangent vector $\boldsymbol{\lambda}$ and the principal normal vector $\boldsymbol{\mu}$ is called the osculating plane to the curve $C$ at $P$.


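Since $\boldsymbol{\mu}$ is a unit vector,

(49.8)  $g_{ij}\mu^i\mu^j = 1,$

and we can treat this quadratic relation just as we did $g_{ij}\lambda^i\lambda^j = 1$ and deduce the orthogonality of vectors $\dfrac{\delta\mu^j}{\delta s}$ and $\mu^j$; that is, $g_{ij}\mu^i\,\dfrac{\delta\mu^j}{\delta s} = 0$.

Moreover, differentiating intrinsically the orthogonality relation 49.6, we get

$g_{ij}\,\dfrac{\delta\lambda^i}{\delta s}\,\mu^j + g_{ij}\lambda^i\,\dfrac{\delta\mu^j}{\delta s} = 0,$

or

$g_{ij}\lambda^i\,\dfrac{\delta\mu^j}{\delta s} = -g_{ij}\,\dfrac{\delta\lambda^i}{\delta s}\,\mu^j = -\kappa\,g_{ij}\mu^i\mu^j = -\kappa,$

where we used equation 49.7 and the quadratic relation 49.8. Thus

(49.9)  $g_{ij}\lambda^i\,\dfrac{\delta\mu^j}{\delta s} = -\kappa,$

and, since $g_{ij}\lambda^i\lambda^j = 1$, we can write (49.9) in the form

$g_{ij}\lambda^i\left(\dfrac{\delta\mu^j}{\delta s} + \kappa\lambda^j\right) = 0,$

which shows that the vector $\dfrac{\delta\mu^j}{\delta s} + \kappa\lambda^j$ is orthogonal to $\lambda^i$. It is also orthogonal to $\mu^i$, because $g_{ij}\mu^i\,\delta\mu^j/\delta s = 0$ and $g_{ij}\mu^i\lambda^j = 0$. Hence, if we define a unit vector $\boldsymbol{\nu}$, with components $\nu^j$, by the formula

(49.10)  $\nu^j = \dfrac{1}{\tau}\left(\dfrac{\delta\mu^j}{\delta s} + \kappa\lambda^j\right),$

the vector $\boldsymbol{\nu}$ will be orthogonal to both $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$. We agree to choose the sign of $\tau$ in such a way that

(49.11)  $\sqrt{g}\,e_{ijk}\lambda^i\mu^j\nu^k = 1,$

so that the triad of unit vectors $\boldsymbol{\lambda}$, $\boldsymbol{\mu}$, $\boldsymbol{\nu}$ forms, at each point $P$ of $C$, a right-handed system of axes.⁶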


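⁶ We deduce from (41.2), and from the definition of the triple scalar product (Sec. 45), that

$e_{ijk}\lambda^i\mu^j\nu^k = \begin{vmatrix} \lambda^1 & \lambda^2 & \lambda^3 \\ \mu^1 & \mu^2 & \mu^3 \\ \nu^1 & \nu^2 & \nu^3 \end{vmatrix} = \dfrac{1}{\sqrt{g}}\,\boldsymbol{\lambda}\cdot\boldsymbol{\mu}\times\boldsymbol{\nu}.$


Since $e_{ijk}$ is a relative tensor of weight $-1$ (Sec. 41), and $g = \left|\dfrac{\partial y^i}{\partial x^j}\right|^2$, it follows that $\epsilon_{ijk} = \sqrt{g}\,e_{ijk}$ is an absolute tensor, and hence the left-hand member of (49.11) is an invariant. An algorithm of division suggests that $\nu^k$ in (49.11) is determined by the formula

(49.12)  $\nu^k = \epsilon^{ijk}\lambda_i\mu_j,$

where $\lambda_i$ and $\mu_j$ are the associated vectors $g_{i\alpha}\lambda^\alpha$ and $g_{j\alpha}\mu^\alpha$, and

$\epsilon^{ijk} = \dfrac{1}{\sqrt{g}}\,e^{ijk}$

is an absolute tensor. The validity of this expression follows from an observation that (49.12) satisfies the conditions of orthogonality $g_{ij}\lambda^i\nu^j = 0$, $g_{ij}\mu^i\nu^j = 0$, and the equation 49.11 determining the orientation of the unit vector $\boldsymbol{\nu}$ relative to $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$. The number $\tau$ appearing in equation 49.10 is called the torsion of $C$ at $P$, and the vector $\boldsymbol{\nu}$ is the binormal.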

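In order to reconcile these definitions with the usual definitions of the principal normal and curvature given in elementary vector analysis, we recall the formula 46.6, $\partial\mathbf{A}/\partial x^j = A^\alpha_{,j}\,\mathbf{a}_\alpha$, and note that if the vector field $\mathbf{A}$ is defined along $C$, we can write

$\dfrac{\partial\mathbf{A}}{\partial x^j}\,\dfrac{dx^j}{ds} = A^\alpha_{,j}\,\dfrac{dx^j}{ds}\,\mathbf{a}_\alpha.$

Using the definition of intrinsic derivative, $\dfrac{\delta A^\alpha}{\delta s} = A^\alpha_{,j}\,\dfrac{dx^j}{ds}$, we can write the preceding result as

(49.13)  $\dfrac{d\mathbf{A}}{ds} = \dfrac{\delta A^\alpha}{\delta s}\,\mathbf{a}_\alpha.$

Let $\mathbf{r}$ be the position vector of the point $P$ on $C$; then the tangent vector $\boldsymbol{\lambda}$ is determined by

$\dfrac{d\mathbf{r}}{ds} = \boldsymbol{\lambda},$

and (49.13) gives for the curvature vector

(49.14)  $\dfrac{d^2\mathbf{r}}{ds^2} = \dfrac{d\boldsymbol{\lambda}}{ds} = \dfrac{\delta\lambda^\alpha}{\delta s}\,\mathbf{a}_\alpha = \mathbf{c},$

where $\mathbf{c}$ is a vector perpendicular⁷ to $\boldsymbol{\lambda}$.

⁷ Since $\boldsymbol{\lambda}\cdot\boldsymbol{\lambda} = 1$, $\boldsymbol{\lambda}\cdot d\boldsymbol{\lambda}/ds = 0$.

With each point $P$ of $C$ we can associate a constant $\kappa$, such that $\mathbf{c}/\kappa = \boldsymbol{\mu}$ is a unit vector. We can now rewrite (49.14) in the form

$\dfrac{d^2\mathbf{r}}{ds^2} = \kappa\boldsymbol{\mu} = \dfrac{\delta\lambda^\alpha}{\delta s}\,\mathbf{a}_\alpha = \kappa\mu^\alpha\mathbf{a}_\alpha,$

where, in the last step, we have made use of the formula 49.7.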


50. Serret-Frenet Formulas 


This section contains a set of three remarkable formulas, generally 
known as Frenet’s formulas, which characterize, in the small, all essential 
geometric properties of space curves. Two of these formulas have already 
been derived in Sec. 49. They are 


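(50.1)  $\dfrac{\delta\lambda^i}{\delta s} = \kappa\mu^i, \qquad \kappa > 0,$

and

(50.2)  $\dfrac{\delta\mu^i}{\delta s} = \tau\nu^i - \kappa\lambda^i.$

The first of these gives the rate of turning of the tangent vector $\boldsymbol{\lambda}$ as the point moves along the curve, and the second that of the principal normal $\boldsymbol{\mu}$. The third formula,

(50.3)  $\dfrac{\delta\nu^i}{\delta s} = -\tau\mu^i,$

to be derived next, specifies the rate of turning of the binormal as the point $P$ moves along the curve.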
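If we differentiate equation

[49.12]  $\nu^i = \epsilon^{ijk}\lambda_j\mu_k$

intrinsically, we get

(50.4)  $\dfrac{\delta\nu^i}{\delta s} = \epsilon^{ijk}\,\dfrac{\delta\lambda_j}{\delta s}\,\mu_k + \epsilon^{ijk}\lambda_j\,\dfrac{\delta\mu_k}{\delta s},$

since the covariant derivatives of $\epsilon^{ijk}$ are zero.⁸ Lowering the indices in (50.1) and (50.2) we get $\dfrac{\delta\lambda_j}{\delta s} = \kappa\mu_j$ and $\dfrac{\delta\mu_k}{\delta s} = \tau\nu_k - \kappa\lambda_k$; and inserting these values in (50.4) yields

$\dfrac{\delta\nu^i}{\delta s} = \kappa\,\epsilon^{ijk}\mu_j\mu_k + \epsilon^{ijk}\lambda_j(\tau\nu_k - \kappa\lambda_k) = -\tau\mu^i,$

since $\epsilon^{ijk}\lambda_j\lambda_k = \epsilon^{ijk}\mu_j\mu_k = 0$, because the $\epsilon^{ijk}$ are skew-symmetric, and $\mu^i = \epsilon^{ijk}\nu_j\lambda_k$. This establishes (50.3).

⁸ For the $\epsilon^{ijk}$ are constants in a cartesian system, hence $\epsilon^{ijk}_{,l} = 0$, and this is a tensor equation!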
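Formulas 50.1, 50.2, and 50.3, when written out explicitly in terms of Christoffel's symbols, assume the forms

$\dfrac{d\lambda^i}{ds} + \left\{ {i \atop jk} \right\}\lambda^j\,\dfrac{dx^k}{ds} = \kappa\mu^i,$

(50.5)  $\dfrac{d\mu^i}{ds} + \left\{ {i \atop jk} \right\}\mu^j\,\dfrac{dx^k}{ds} = -(\kappa\lambda^i - \tau\nu^i),$

$\dfrac{d\nu^i}{ds} + \left\{ {i \atop jk} \right\}\nu^j\,\dfrac{dx^k}{ds} = -\tau\mu^i.$

Except for position of the curve $C$ in space, the system 50.5 determines the curve uniquely when continuous functions $\kappa(s)$ and $\tau(s)$ are specified along $C$.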
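In cartesian coordinates the Christoffel symbols vanish, so the intrinsic derivatives in (50.1) to (50.3) reduce to ordinary derivatives and the three formulas can be checked directly. The sketch below assumes SymPy and uses a circular helix with arc-length parameter as an illustrative curve; the symbols `a`, `k`, `c` and the variable names are assumptions made for this example.

```python
import sympy as sp

s = sp.symbols("s", real=True)
a, k = sp.symbols("a k", positive=True)
c = sp.sqrt(a**2 + k**2)            # chosen so that s is the arc parameter

# Circular helix in orthogonal cartesian coordinates y^i(s).
y = sp.Matrix([a * sp.cos(s / c), a * sp.sin(s / c), k * s / c])

lam = y.diff(s)                                   # unit tangent, lambda^i = dy^i/ds
dlam = lam.diff(s)
kappa = sp.sqrt(sp.factor(sp.simplify(dlam.dot(dlam))))   # curvature from (50.1)
mu = (dlam / kappa).applyfunc(sp.simplify)        # principal normal
nu = lam.cross(mu).applyfunc(sp.simplify)         # binormal, right-handed triad
tau = sp.simplify(-nu.diff(s).dot(mu))            # torsion from (50.3)

# Residuals of (50.2) and (50.3); both should vanish identically.
res_mu = (mu.diff(s) - (tau * nu - kappa * lam)).applyfunc(sp.simplify)
res_nu = (nu.diff(s) + tau * mu).applyfunc(sp.simplify)
print(kappa, tau, res_mu.T, res_nu.T)
```

For this curve the curvature and torsion come out constant, which is consistent with the remark that $\kappa(s)$ and $\tau(s)$ fix the curve up to its position in space.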

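We conclude this section by considering an example illustrating the use of Frenet's formulas. Consider a curve, defined in cylindrical coordinates by equations

$x^1 = a, \qquad x^2 = \theta(s), \qquad x^3 = 0.$

This curve is a circle of radius $a$. The square of the element of arc in cylindrical coordinates is

$ds^2 = (dx^1)^2 + (x^1)^2(dx^2)^2 + (dx^3)^2,$

so that $g_{11} = 1$, $g_{22} = (x^1)^2$, $g_{33} = 1$, $g_{ij} = 0$ for $i \ne j$, and it is easy to verify that the nonvanishing Christoffel symbols are

$\left\{ {1 \atop 22} \right\} = -x^1, \qquad \left\{ {2 \atop 12} \right\} = \left\{ {2 \atop 21} \right\} = \dfrac{1}{x^1}.$

The components of the tangent vector $\boldsymbol{\lambda}$ to the circle $C$ are $\lambda^i = dx^i/ds$, so that $\lambda^1 = 0$, $\lambda^2 = d\theta/ds$, $\lambda^3 = 0$. Since $\boldsymbol{\lambda}$ is a unit vector, $g_{ij}\lambda^i\lambda^j = 1$ at all points of $C$, and this requires that

$a^2\left(\dfrac{d\theta}{ds}\right)^2 = 1.$

Therefore $(d\theta/ds)^2 = 1/a^2$, and the first formula in (50.5) yields

$\kappa\mu^1 = \dfrac{d\lambda^1}{ds} + \left\{ {1 \atop jk} \right\}\lambda^j\,\dfrac{dx^k}{ds} = \left\{ {1 \atop 22} \right\}\left(\dfrac{d\theta}{ds}\right)^2 = -\dfrac{1}{a},$

$\kappa\mu^2 = \dfrac{d\lambda^2}{ds} + \left\{ {2 \atop jk} \right\}\lambda^j\,\dfrac{dx^k}{ds} = \left\{ {2 \atop 21} \right\}\lambda^2\,\dfrac{dx^1}{ds} = 0,$

$\kappa\mu^3 = \dfrac{d\lambda^3}{ds} + \left\{ {3 \atop jk} \right\}\lambda^j\,\dfrac{dx^k}{ds} = 0.$

Since $\boldsymbol{\mu}$ is a unit vector, $g_{ij}\mu^i\mu^j = 1$, and it follows that $\kappa = 1/a$ and $\mu^1 = -1$, $\mu^2 = 0$, $\mu^3 = 0$.

An entirely analogous calculation shows that $\tau = 0$ and $\nu^1 = 0$, $\nu^2 = 0$, $\nu^3 = 1$.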
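The circle example can be reproduced mechanically from (50.5). This is a minimal sketch, assuming SymPy and taking $\theta(s) = s/a$ (one of the two choices allowed by $(d\theta/ds)^2 = 1/a^2$); the helper names `christoffel`, `intrinsic`, and `norm` are hypothetical.

```python
import sympy as sp

s = sp.symbols("s", real=True)
a = sp.symbols("a", positive=True)
X = sp.symbols("x1 x2 x3", positive=True)        # cylindrical coordinates

g = sp.diag(1, X[0]**2, 1)                       # metric of cylindrical coordinates
g_inv = g.inv()

def christoffel(i, j, k):
    """Christoffel symbol of the second kind {i; jk} for the metric g."""
    return sum(g_inv[i, m] * (sp.diff(g[m, j], X[k]) + sp.diff(g[m, k], X[j])
                              - sp.diff(g[j, k], X[m])) for m in range(3)) / 2

# Circle of radius a; theta = s/a is an assumed choice satisfying g_ij lam^i lam^j = 1.
curve = [a, s / a, sp.Integer(0)]
on_curve = dict(zip(X, curve))
g_c = g.subs(on_curve)
lam = [sp.diff(ci, s) for ci in curve]           # tangent components dx^i/ds

def intrinsic(v):
    """Intrinsic derivative of v^i along the curve, left-hand sides of (50.5)."""
    return [sp.simplify(sp.diff(v[i], s)
                        + sum(christoffel(i, j, k).subs(on_curve) * v[j] * lam[k]
                              for j in range(3) for k in range(3)))
            for i in range(3)]

def norm(v):
    """Length of v^i measured with the metric on the curve."""
    return sp.sqrt(sum(g_c[i, j] * v[i] * v[j] for i in range(3) for j in range(3)))

dlam = intrinsic(lam)
kappa = sp.simplify(norm(dlam))                  # curvature
mu = [sp.simplify(c / kappa) for c in dlam]      # principal normal
w = [sp.simplify(dm + kappa * l) for dm, l in zip(intrinsic(mu), lam)]
tau = sp.simplify(norm(w))                       # magnitude of torsion, from (49.10)
print(kappa, mu, tau)
```

The vector `w` is $\delta\mu^i/\delta s + \kappa\lambda^i$; since it vanishes identically, the torsion is zero, in agreement with the values quoted in the example.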
Problems

1. Find the curvature and torsion at any point of the circular helix $C$ whose equations in cylindrical coordinates are

$C:\quad x^1 = a, \qquad x^2 = \theta, \qquad x^3 = k\theta.$

Show that the tangent vector $\boldsymbol{\lambda}$ at every point of $C$ makes a constant angle with the direction of the $X^3$-axis. Consider $C$ also in the form $y^1 = a\cos\theta$, $y^2 = a\sin\theta$, $y^3 = k\theta$, where the coordinates $y^i$ are rectangular cartesian.

2. Show that

$\dfrac{\delta^2\lambda^i}{\delta s^2} = \dfrac{d\kappa}{ds}\,\mu^i + \kappa(\tau\nu^i - \kappa\lambda^i),$

$\dfrac{\delta^2\mu^i}{\delta s^2} = \dfrac{d\tau}{ds}\,\nu^i - (\kappa^2 + \tau^2)\mu^i - \dfrac{d\kappa}{ds}\,\lambda^i,$

$\dfrac{\delta^2\nu^i}{\delta s^2} = \tau(\kappa\lambda^i - \tau\nu^i) - \dfrac{d\tau}{ds}\,\mu^i.$

3. Using results of Problem 1, show that the ratio of the curvature $\kappa$ to the torsion $\tau$ is a constant. Show from Frenet's formulas that whenever $\tau/\kappa$ = constant, and the coordinates are cartesian, $\nu^i = c\lambda^i + b^i$, where $c$ and $b^i$ are constants. From this result it follows that $\lambda^i b^i$ = constant, so that the curve makes a constant angle with the lines whose direction ratios are $b^i$. In other words, the curve is a cylindrical helix. This theorem is due to Bertrand.

4. When $C$ is specified in the form

$C:\quad y^i = y^i(s),$

where the $y^i$ are orthogonal cartesian coordinates and $s$ is the arc parameter, show that

$\kappa^2 = \left(\dfrac{d^2y^1}{ds^2}\right)^2 + \left(\dfrac{d^2y^2}{ds^2}\right)^2 + \left(\dfrac{d^2y^3}{ds^2}\right)^2$

and

$\tau = \dfrac{1}{\kappa^2}\begin{vmatrix} (y^1)' & (y^2)' & (y^3)' \\ (y^1)'' & (y^2)'' & (y^3)'' \\ (y^1)''' & (y^2)''' & (y^3)''' \end{vmatrix}.$

5. Write equations of Problem 2 in cartesian coordinates $y^i$ and show that when $\tau = 0$ and $\kappa$ = constant along $C$, the equations of $C$ are

$y^i = A^i\cos\kappa s + B^i\sin\kappa s + C^i,$

where $A^iA^i = B^iB^i = 1/\kappa^2$, $A^iB^i = 0$. Thus $C$ is a circle.

6. Let $C$ be a cylindrical helix determined by

$C:\quad y^1 = \varphi(\sigma), \qquad y^2 = \psi(\sigma), \qquad y^3 = k\sigma, \qquad k = \text{constant},$

where $\sigma$ is the arc parameter of the directrix curve $C'$ in the $y^1y^2$-plane, so that $(d\sigma)^2 = (dy^1)^2 + (dy^2)^2$. Note that $(ds)^2 = (1 + k^2)(d\sigma)^2$ and show that

$\tau = \dfrac{1}{\kappa^2(1 + k^2)^3}\begin{vmatrix} \varphi' & \psi' & k \\ \varphi'' & \psi'' & 0 \\ \varphi''' & \psi''' & 0 \end{vmatrix}, \qquad \kappa = \dfrac{\varphi'\psi'' - \psi'\varphi''}{1 + k^2},$

and verify that $\tau/\kappa = k$.


51. Equations of a Straight Line 


Let $A^i$ be a vector field defined along a curve $C$ in $E_3$, where $C$ is given parametrically as

$C:\quad x^i = x^i(s), \qquad s_1 \le s \le s_2,$

$s$ being the arc parameter. If the vector field $A^i$ is parallel, then it follows from Sec. 48 that $\delta A^i/\delta s = 0$, or

(51.1)  $\dfrac{dA^i}{ds} + \left\{ {i \atop \alpha\beta} \right\}A^\alpha\,\dfrac{dx^\beta}{ds} = 0.$

We shall make use of equation 51.1 to obtain the equations of a straight line in general curvilinear coordinates. The characteristic property of straight lines is that the tangent vector $\boldsymbol{\lambda}$ to a straight line is directed along the straight line, so that the totality of tangent vectors $\boldsymbol{\lambda}$ forms a parallel vector field. Thus the field of tangent vectors $\lambda^i = dx^i/ds$ must satisfy (51.1), and we have

$\dfrac{\delta\lambda^i}{\delta s} = \dfrac{d\lambda^i}{ds} + \left\{ {i \atop \alpha\beta} \right\}\lambda^\alpha\,\dfrac{dx^\beta}{ds} = 0.$

The equation

(51.2)  $\dfrac{d^2x^i}{ds^2} + \left\{ {i \atop \alpha\beta} \right\}\dfrac{dx^\alpha}{ds}\,\dfrac{dx^\beta}{ds} = 0$

is the equation sought. In cartesian coordinates the Christoffel symbols vanish and we obtain the familiar form of differential equations of straight lines. From the geometric interpretation of the curvature $\kappa$ as a measure of the rate of turning of the tangent line to a curve, we are led to define the curvature of a straight line to be zero. This definition is consistent with the first of Frenet's formulas 50.1.
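Equation (51.2) can be tested on a line whose cartesian description is known. The sketch below assumes SymPy and works in plane polar coordinates $x^1 = r$, $x^2 = \theta$ (with the nonvanishing symbols $\{1; 22\} = -r$ and $\{2; 12\} = \{2; 21\} = 1/r$ quoted in Sec. 50); the particular line $y^1 = 1$, $y^2 = s$ is an illustrative choice.

```python
import sympy as sp

s = sp.symbols("s", real=True)

# The straight line y^1 = 1, y^2 = s written in polar coordinates (r, theta).
r = sp.sqrt(1 + s**2)
th = sp.atan(s)

# s is the arc parameter if g_ij (dx^i/ds)(dx^j/ds) = 1 with g_11 = 1, g_22 = r^2.
unit = sp.simplify(r.diff(s)**2 + r**2 * th.diff(s)**2)

# Left-hand members of (51.2) for i = 1 and i = 2.
res_r = sp.simplify(r.diff(s, 2) - r * th.diff(s)**2)
res_th = sp.simplify(th.diff(s, 2) + 2 / r * r.diff(s) * th.diff(s))
print(unit, res_r, res_th)
```

Neither $r(s)$ nor $\theta(s)$ is linear in $s$, yet both left-hand members of (51.2) vanish, because the Christoffel terms compensate exactly for the curvature of the coordinate net.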


52. Curvilinear Coordinates on a Surface 


In the remainder of this chapter we will study the properties of surfaces 
imbedded in a three-dimensional Euclidean space. It will be shown that 
certain of these properties can be phrased independently of the space 
in which the surface is immersed and that they are concerned solely with 
the structure of the differential quadratic form for the element of arc of a 
curve drawn on the surface. All such properties of surfaces are termed 
the intrinsic properties, and the geometry based on the study of this 
differential quadratic form is called the intrinsic geometry of the surface. 

We find it convenient to refer the space in which the surface is imbedded 
to a set of orthogonal cartesian axes Y, and regard the locus of points 
satisfying the equation 


(52.1)  $F(y^1, y^2, y^3) = 0$

as an analytical definition of a surface $S$. We suppose that only two of the variables $y^i$ in (52.1) are independent and that the specification of, say, $y^1$ and $y^2$ in some region of the $Y^1Y^2$-plane determines uniquely a real number $y^3$ such that the left-hand member in (52.1) reduces to zero. If we suppose that $F(y^1, y^2, y^3)$, regarded as a function of three independent variables, is of class $C^1$ in some region $R$ about the point $P_0(y_0^1, y_0^2, y_0^3)$ with $\left(\dfrac{\partial F}{\partial y^3}\right)_{P_0} \ne 0$ and $F(y_0^1, y_0^2, y_0^3) = 0$, then the fundamental theorem on implicit functions guarantees the existence of a unique solution $y^3 = f(y^1, y^2)$, such that $y_0^3 = f(y_0^1, y_0^2)$.


The definition of the surface by means of a single equation 52.1 is less convenient than the one introduced by Gauss, who defined the surface as a locus of points satisfying three equations of the type

(52.2)  $y^i = y^i(u^1, u^2),$

where $u_1^1 \le u^1 \le u_2^1$ and $u_1^2 \le u^2 \le u_2^2$, and the $y^i$ are real functions of class $C^1$ in the region of definition of the independent parameters $u^1$, $u^2$. In order to reconcile these two different definitions we shall require that the functions $y^i(u^1, u^2)$ be such that the Jacobian matrix

(52.3)  $\begin{pmatrix} \dfrac{\partial y^1}{\partial u^1} & \dfrac{\partial y^2}{\partial u^1} & \dfrac{\partial y^3}{\partial u^1} \\[2ex] \dfrac{\partial y^1}{\partial u^2} & \dfrac{\partial y^2}{\partial u^2} & \dfrac{\partial y^3}{\partial u^2} \end{pmatrix}$

be of rank two, so that not all the determinants of the second order selected from this matrix vanish identically in the region of definition of parameters $u^\alpha$. This requirement ensures that it is possible to solve two equations in (52.2) for $u^1$ and $u^2$ in terms of some pair of variables $y^i$, and the substitution of these solutions in the remaining equation leads to an equation of the form $y^3 = y^3(y^1, y^2)$. It should be remarked that, if any two determinants formed from the matrix 52.3 vanish identically, then the third one also vanishes, provided that the surface $S$ is not a plane parallel to one of the coordinate planes.

Since $u^1$ and $u^2$ are independent variables, the locus defined by equations 52.2 is two-dimensional, and these equations give the coordinates $y^i$ of a point on the surface when $u^1$ and $u^2$ are assigned particular values. This point of view leads one to consider the surface as a two-dimensional manifold $S$ imbedded in a three-dimensional enveloping space $E_3$. We can also study surfaces without reference to the surrounding space, and consider parameters $u^1$ and $u^2$ as coordinates of points in the surface. A familiar example of this is the use of the latitude and longitude as coordinates of points on the surface of the earth.

If we assign to $u^1$ in (52.2) some fixed value $u^1 = c$ (Fig. 22) we obtain as a locus the one-dimensional manifold

$y^i = y^i(c, u^2), \qquad (i = 1, 2, 3),$

which is a curve lying on the surface $S$ defined by equations 52.2. We shall call this curve the $u^2$-curve. Similarly, setting $u^2$ = constant in (52.2) defines the $u^1$-curve, along which only $u^1$ varies. By assigning to $u^1$ and $u^2$ a succession of fixed values, we obtain a net of curves, on the surface, which are termed the coordinate curves. The intersection of a pair of coordinate curves obtained by setting $u^1 = u_0^1$, $u^2 = u_0^2$ determines a point $P_0$. The variables $u^1$, $u^2$ determining the point $P$ on $S$ are called the curvilinear, or Gaussian, coordinates on the surface.

Fig. 22

Obviously the parametric representation of a surface in the form 52.2 
is not unique, and there are infinitely many curvilinear coordinate systems 
which can be used to locate points on a given surface S. Thus, if one 
introduces a transformation 


(52.4)  $u^1 = u^1(\bar{u}^1, \bar{u}^2), \qquad u^2 = u^2(\bar{u}^1, \bar{u}^2),$

where the $u^\alpha(\bar{u}^1, \bar{u}^2)$ are of class $C^1$ and are such that the Jacobian $J = \dfrac{\partial(u^1, u^2)}{\partial(\bar{u}^1, \bar{u}^2)}$ does not vanish in some region of the variables $\bar{u}^\alpha$, then one can insert the values from (52.4) in (52.2) and obtain a different set of parametric equations,

(52.5)  $y^i = f^i(\bar{u}^1, \bar{u}^2), \qquad (i = 1, 2, 3),$

defining the same surface $S$. Equations 52.4 can be looked upon as representing a transformation of coordinates in the surface precisely in the same way as equations $x^i = x^i(\bar{x}^1, \bar{x}^2, \bar{x}^3)$, $(i = 1, 2, 3)$ were viewed as defining a transformation of coordinates in $E_3$.


53. Intrinsic Geometry. First Fundamental Quadratic Form. 
Metric Tensor 


We remarked in Sec. 52 that the properties of surfaces that can be 
described without reference to the space in which the surface is imbedded 
are termed intrinsic properties. A study of intrinsic properties is made 




to depend on a certain quadratic differential form describing the metric 
character of the surface. We proceed to derive this quadratic form. 

It will be convenient to adopt certain conventions concerning the 
meaning of the indices to be used in this and remaining sections of this 
chapter. We will be dealing with two distinct sets of variables: those referring to the space $E_3$ in which the surface is imbedded (these are three in number), and with two curvilinear coordinates $u^1$ and $u^2$ referring to the two-dimensional manifold $S$. In order not to confuse these sets of variables we shall use Latin letters for the indices referring to the space variables and Greek letters for the surface variables. Thus Latin indices will assume values 1, 2, 3, and Greek indices will have the range of values 1, 2. A transformation $T$ of space coordinates from one system $X$ to another system $\bar{X}$ will be written as

$x^i = x^i(\bar{x}^1, \bar{x}^2, \bar{x}^3);$

a transformation of Gaussian surface coordinates, such as described by equations 52.4, will be denoted by

$u^\alpha = u^\alpha(\bar{u}^1, \bar{u}^2).$

A repeated Greek index in any term denotes the summation from 1 to 2; a repeated Latin index represents the sum from 1 to 3. Unless a statement to the contrary is made, we shall suppose that all functions appearing in the discussion of the remainder of this chapter are of class $C^2$ in the regions of their definition.
Consider a surface S defined by 


(53.1)  $y^i = y^i(u^1, u^2),$

where the $y^i$ are the orthogonal cartesian coordinates covering the space $E_3$ in which the surface $S$ is imbedded, and a curve $C$ on $S$ defined by

(53.2)  $u^\alpha = u^\alpha(t), \qquad t_1 \le t \le t_2,$

where the $u^\alpha$'s are the Gaussian coordinates covering $S$. Viewed from the surrounding space, the curve defined by (53.2) is a curve in a three-dimensional Euclidean space and its element of arc is given by the formula

(53.3)  $ds^2 = dy^i\,dy^i.$

From (53.1) we have

(53.4)  $dy^i = \dfrac{\partial y^i}{\partial u^\alpha}\,du^\alpha,$

where, as is clear from (53.2),

$du^\alpha = \dfrac{du^\alpha}{dt}\,dt.$

Substituting from (53.4) in (53.3), we get

$ds^2 = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{\partial y^i}{\partial u^\beta}\,du^\alpha\,du^\beta = a_{\alpha\beta}\,du^\alpha\,du^\beta,$

where

(53.5)  $a_{\alpha\beta} = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{\partial y^i}{\partial u^\beta}.$

The expression for $ds^2$, namely,

(53.6)  $ds^2 = a_{\alpha\beta}\,du^\alpha\,du^\beta,$

is the square of the linear element of $C$ lying on the surface $S$, and the right-hand member of (53.6) is called the first fundamental quadratic form of the surface. The length of arc of the curve defined by (53.2) is given by the formula

$s = \displaystyle\int_{t_1}^{t_2}\sqrt{a_{\alpha\beta}\,\dot{u}^\alpha\dot{u}^\beta}\;dt,$

where $\dot{u}^\alpha = du^\alpha/dt$. Since in a nontrivial case $ds^2 > 0$, it follows at once from (53.6) upon setting $u^2$ = constant and $u^1$ = constant in turn, that $ds_{(1)}^2 = a_{11}(du^1)^2$ and $ds_{(2)}^2 = a_{22}(du^2)^2$. Thus $a_{11}$ and $a_{22}$ are positive functions of $u^1$ and $u^2$.
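Formula (53.5) says that the matrix $(a_{\alpha\beta})$ is the product of the transposed Jacobian matrix of (53.1) with itself. The following is a minimal sketch, assuming SymPy, for a sphere of radius `rad` parametrized by colatitude `u1` and longitude `u2` (the parametrization that also appears in Problem 2 of Sec. 54); the meridian used for the arc-length integral is an illustrative choice.

```python
import sympy as sp

u1, u2, t = sp.symbols("u1 u2 t", real=True)
rad = sp.symbols("r", positive=True)

# Sphere of radius rad in Gaussian coordinates (u1 = colatitude, u2 = longitude).
y = sp.Matrix([rad * sp.sin(u1) * sp.cos(u2),
               rad * sp.sin(u1) * sp.sin(u2),
               rad * sp.cos(u1)])

J = y.jacobian([u1, u2])                  # entries dy^i/du^alpha (3 x 2)
a = (J.T * J).applyfunc(sp.simplify)      # metric a_{alpha beta} of (53.5)

# Arc length of the meridian u1 = t, u2 = 0, 0 <= t <= pi.
curve = {u1: t, u2: sp.Integer(0)}
udot = sp.Matrix([sp.diff(curve[u1], t), sp.diff(curve[u2], t)])
integrand = sp.sqrt((udot.T * a.subs(curve) * udot)[0, 0])
length = sp.integrate(integrand, (t, 0, sp.pi))
print(a, length)
```

The diagonal entries are positive, as the argument above requires, and the meridian from pole to pole has the expected length of half a great circle.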

Consider a transformation of surface coordinates 


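(53.7)  $u^\alpha = u^\alpha(\bar{u}^1, \bar{u}^2),$

with a nonvanishing Jacobian $J = \left|\dfrac{\partial u^\alpha}{\partial\bar{u}^\beta}\right|$. It follows from (53.7) that

$du^\alpha = \dfrac{\partial u^\alpha}{\partial\bar{u}^\gamma}\,d\bar{u}^\gamma,$

and hence (53.6) yields

$ds^2 = a_{\alpha\beta}\,\dfrac{\partial u^\alpha}{\partial\bar{u}^\gamma}\,\dfrac{\partial u^\beta}{\partial\bar{u}^\delta}\,d\bar{u}^\gamma\,d\bar{u}^\delta.$

If we set

$\bar{a}_{\gamma\delta} = a_{\alpha\beta}\,\dfrac{\partial u^\alpha}{\partial\bar{u}^\gamma}\,\dfrac{\partial u^\beta}{\partial\bar{u}^\delta},$

we see that the set of quantities $a_{\alpha\beta}$ represents a symmetric covariant tensor of rank two with respect to the admissible transformations 53.7 of surface coordinates. The fact that the $a_{\alpha\beta}$ are components of a tensor is also evident from (53.6), since $ds^2$ is an invariant and the quantities $a_{\alpha\beta}$ are symmetric. The tensor $a_{\alpha\beta}$ is called the covariant metric tensor of the surface.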
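The transformation law for $a_{\alpha\beta}$ can be compared against a direct computation in the new coordinates. The sketch below assumes SymPy; the paraboloid $y^3 = (u^1)^2 + (u^2)^2$ and the change to polar surface coordinates $(\bar{u}^1, \bar{u}^2) = (\rho, \varphi)$ are illustrative choices, and `metric` is a hypothetical helper.

```python
import sympy as sp

u1, u2 = sp.symbols("u1 u2", real=True)
rho, phi = sp.symbols("rho phi", positive=True)

def metric(y, params):
    """First fundamental form a_{alpha beta} = (dy^i/du^alpha)(dy^i/du^beta)."""
    J = y.jacobian(params)
    return (J.T * J).applyfunc(sp.simplify)

# Paraboloid in the coordinates (u1, u2).
y = sp.Matrix([u1, u2, u1**2 + u2**2])
a = metric(y, [u1, u2])

# Transformation (53.7): u^alpha as functions of the new coordinates (rho, phi).
change = {u1: rho * sp.cos(phi), u2: rho * sp.sin(phi)}
U = sp.Matrix([change[u1], change[u2]])
dU = U.jacobian([rho, phi])               # entries du^alpha / d ubar^gamma

# Metric in the new coordinates: by the tensor law, and directly from y(rho, phi).
a_bar_law = (dU.T * a.subs(change) * dU).applyfunc(sp.simplify)
a_bar_direct = metric(y.subs(change), [rho, phi])
print(a_bar_law, a_bar_direct)
```

Both routes give the same matrix, which is the content of the statement that $a_{\alpha\beta}$ transforms as a covariant tensor of rank two.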


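Since the form 53.6 is positive definite, the determinant

$a = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0,$

and we can define the reciprocal tensor $a^{\alpha\beta}$ (see Sec. 30) by the formula $a^{\alpha\beta}a_{\beta\gamma} = \delta^\alpha_\gamma$. Thus we have

$a^{11} = \dfrac{a_{22}}{a}, \qquad a^{12} = a^{21} = -\dfrac{a_{12}}{a}, \qquad a^{22} = \dfrac{a_{11}}{a}.$

The contravariant tensor $a^{\alpha\beta}$ is called the contravariant metric tensor.

We can repeat, almost verbatim, the contents of Sec. 44 concerning metric properties of our two-dimensional space $S$. Thus the direction of a linear element in the surface can be specified either by the direction cosines $dy^i/ds$, $(i = 1, 2, 3)$, or by the direction parameters

(53.8)  $\lambda^\alpha = \dfrac{du^\alpha}{ds}.$

For,

$\dfrac{dy^i}{ds} = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{du^\alpha}{ds},$

and the $du^\alpha/ds$ are uniquely determined when the direction cosines $dy^i/ds$ are specified, and conversely. We define the length of the surface vector $A^\alpha$, that is, the vector determined by $A^1(u^1, u^2)$ and $A^2(u^1, u^2)$, by the formula⁹

$A = \sqrt{a_{\alpha\beta}A^\alpha A^\beta}.$

It follows from (53.6) that

$1 = a_{\alpha\beta}\,\dfrac{du^\alpha}{ds}\,\dfrac{du^\beta}{ds} = a_{\alpha\beta}\lambda^\alpha\lambda^\beta,$

so that the direction parameters $\lambda^\alpha$ are components of a unit vector. The covariant vector

(53.9)  $\lambda_\alpha = a_{\alpha\beta}\lambda^\beta$

is sometimes called the direction moment, and it is clear from (53.9) that

$a^{\alpha\gamma}\lambda_\alpha = a^{\alpha\gamma}a_{\alpha\beta}\lambda^\beta = \delta^\gamma_\beta\lambda^\beta = \lambda^\gamma,$

and that

(53.10)  $\lambda^\alpha\lambda_\alpha = a_{\alpha\beta}\lambda^\alpha\lambda^\beta.$

⁹ The components $A^i$ of the vector $A^\alpha$, as viewed from the enveloping space $E_3$, are given by $A^i = \dfrac{\partial y^i}{\partial u^\alpha}A^\alpha$ and it is clear that $A^iA^i = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{\partial y^i}{\partial u^\beta}A^\alpha A^\beta = a_{\alpha\beta}A^\alpha A^\beta$.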




54. Angle between Two Intersecting Curves in a Surface. 
Element of Surface Area 


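The equations of a curve $C$ drawn on the surface $S$ can be written in the form

$C:\quad u^\alpha = u^\alpha(t), \qquad t_1 \le t \le t_2.$

Since the $u^\alpha(t)$ are assumed to be of class $C^2$, the curve $C$ has a continuously turning tangent. Let $C_1$ and $C_2$ be two such curves intersecting at the point $P$ of $S$ (Fig. 23). We take the equations of $S$, referred to orthogonal cartesian axes $Y$, in the form

(54.1)  $y^i = y^i(u^1, u^2),$

and denote the direction cosines of the tangent lines to $C_1$ and $C_2$ at $P$ by $\xi^i$ and $\eta^i$, respectively. The cosine of the angle $\theta$ between $C_1$ and $C_2$, calculated by a geometer in the enveloping space $E_3$, is

(54.2)  $\cos\theta = \xi^i\eta^i.$

However,

$\xi^i = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{d_1u^\alpha}{ds_1} = \dfrac{d_1y^i}{ds_1}, \qquad \eta^i = \dfrac{\partial y^i}{\partial u^\beta}\,\dfrac{d_2u^\beta}{ds_2} = \dfrac{d_2y^i}{ds_2},$

where the subscripts 1 and 2 refer to the elements of arc of $C_1$ and $C_2$, respectively. Using the definition 53.8, we can write the unit vectors in the directions of the tangents to $C_1$ and $C_2$ as

$\lambda^\alpha = \dfrac{d_1u^\alpha}{ds_1}, \qquad \mu^\alpha = \dfrac{d_2u^\alpha}{ds_2},$

and

(54.3)  $\xi^i = \dfrac{\partial y^i}{\partial u^\alpha}\,\lambda^\alpha, \qquad \eta^i = \dfrac{\partial y^i}{\partial u^\beta}\,\mu^\beta.$

Fig. 23

Inserting in (54.2) the expressions from (54.3), we get

$\cos\theta = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{\partial y^i}{\partial u^\beta}\,\lambda^\alpha\mu^\beta,$

and since

$a_{\alpha\beta} = \dfrac{\partial y^i}{\partial u^\alpha}\,\dfrac{\partial y^i}{\partial u^\beta},$

the foregoing expression can be written

(54.4)  $\cos\theta = a_{\alpha\beta}\lambda^\alpha\mu^\beta.$

If the curves $C_1$ and $C_2$ are orthogonal,

(54.5)  $a_{\alpha\beta}\lambda^\alpha\mu^\beta = 0.$

In particular, if the surface vectors $\lambda^\alpha$ and $\mu^\alpha$ are taken along the coordinate curves ($\lambda^1 = 1/\sqrt{a_{11}}$, $\lambda^2 = 0$, $\mu^1 = 0$, $\mu^2 = 1/\sqrt{a_{22}}$), then it follows from (54.5) that the coordinate curves will form an orthogonal net if, and only if, $a_{12} = 0$ at every point of the surface.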


We can give a pictorial interpretation of these results in the manner of Sec. 45. Thus, if $\mathbf{r}$ denotes the position vector of any point $P$ on the surface $S$, and the $\mathbf{b}_i$ are the unit vectors directed along the orthogonal coordinate axes $Y$, then equations 53.1 of the surface $S$ can be written in vector form (see Fig. 24) as

$\mathbf{r}(u^1, u^2) = \mathbf{b}_i\,y^i(u^1, u^2).$

Fig. 24

It follows from this representation of $S$ that

$ds^2 = d\mathbf{r}\cdot d\mathbf{r} = \dfrac{\partial\mathbf{r}}{\partial u^\alpha}\cdot\dfrac{\partial\mathbf{r}}{\partial u^\beta}\,du^\alpha\,du^\beta = a_{\alpha\beta}\,du^\alpha\,du^\beta,$

where

(54.6)  $a_{\alpha\beta} = \dfrac{\partial\mathbf{r}}{\partial u^\alpha}\cdot\dfrac{\partial\mathbf{r}}{\partial u^\beta}.$

Setting $\partial\mathbf{r}/\partial u^\alpha = \mathbf{a}_\alpha$, where $\mathbf{a}_1$ and $\mathbf{a}_2$ are obviously tangent vectors to the coordinate curves, we see that

$a_{11} = \mathbf{a}_1\cdot\mathbf{a}_1, \qquad a_{12} = \mathbf{a}_1\cdot\mathbf{a}_2, \qquad a_{22} = \mathbf{a}_2\cdot\mathbf{a}_2.$

In the notation of (54.3) the space components of $\mathbf{a}_1$ and $\mathbf{a}_2$ are $\xi^i$ and $\eta^i$, respectively.

We can define an element of area $d\sigma$ of the surface $S$ by the formula

$d\sigma = |\mathbf{a}_1\times\mathbf{a}_2|\,du^1\,du^2,$

and it is readily verified that the right-hand member of this expression can be written

(54.7)  $d\sigma = \sqrt{a_{11}a_{22} - a_{12}^2}\;du^1\,du^2 = \sqrt{a}\;du^1\,du^2.$

This formula has precisely the same structure as the expression 44.11 for the volume element.
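The identity $|\mathbf{a}_1\times\mathbf{a}_2|^2 = a_{11}a_{22} - a_{12}^2$ behind (54.7) can be confirmed on a concrete surface and the area element then integrated. The sketch below assumes SymPy and uses a torus with radii `R` and `r` as an illustrative surface; the closed form taken for $\sqrt{a}$ is valid under the stated assumption $R > r > 0$ and is checked by squaring.

```python
import sympy as sp

u1, u2 = sp.symbols("u1 u2", real=True)
R, r = sp.symbols("R r", positive=True)

# Torus: u1 runs around the tube, u2 around the axis (illustrative surface).
y = sp.Matrix([(R + r * sp.cos(u1)) * sp.cos(u2),
               (R + r * sp.cos(u1)) * sp.sin(u2),
               r * sp.sin(u1)])

a1, a2 = y.diff(u1), y.diff(u2)                      # tangent vectors a_alpha
a11, a12, a22 = a1.dot(a1), a1.dot(a2), a2.dot(a2)   # metric components (54.6)

det_a = sp.simplify(a11 * a22 - a12**2)              # a = det(a_{alpha beta})
n = a1.cross(a2)
cross_sq = sp.simplify(n.dot(n))                     # |a_1 x a_2|^2

# sqrt(a) written explicitly; positive everywhere when R > r (an assumption here).
sqrt_a = r * (R + r * sp.cos(u1))
area = sp.integrate(sqrt_a, (u1, 0, 2 * sp.pi), (u2, 0, 2 * sp.pi))
print(sp.simplify(det_a - cross_sq), sp.simplify(det_a - sqrt_a**2), area)
```

Integrating $\sqrt{a}$ over the full parameter rectangle gives the total area of the surface, just as integrating $\sqrt{g}$ gives a volume in Sec. 44.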


It follows from Sec. 40 that the skew-symmetric $e$-systems, in a two-dimensional manifold, can be defined by the formulas

$e_{11} = e_{22} = e^{11} = e^{22} = 0, \qquad e_{12} = -e_{21} = e^{12} = -e^{21} = 1,$

and, since these systems are relative tensors (see Sec. 41), the expressions

$\epsilon_{\alpha\beta} = \sqrt{a}\,e_{\alpha\beta} \qquad\text{and}\qquad \epsilon^{\alpha\beta} = \dfrac{1}{\sqrt{a}}\,e^{\alpha\beta}$

are absolute tensors. Using the $\epsilon$-symbols, we can write the sine of the angle $\theta$ between two unit vectors $\lambda^\alpha$, $\mu^\alpha$ in the form

$\epsilon_{\alpha\beta}\lambda^\alpha\mu^\beta = \sin\theta,$

which is numerically equal to the area of the parallelogram constructed on the unit vectors $\lambda^\alpha$ and $\mu^\alpha$. It follows from this result that a necessary and sufficient condition for the orthogonality of two surface unit-vectors $\lambda^\alpha$ and $\mu^\alpha$ is $|\epsilon_{\alpha\beta}\lambda^\alpha\mu^\beta| = 1$.
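The cosine formula (54.4) and the sine formula above can be checked against each other numerically. This is a rough sketch, assuming NumPy; the surface $y = (u^1, u^2, u^1u^2)$, the point, and the two directions are arbitrary illustrative choices, and `unit` is a hypothetical helper that normalizes a surface vector with the metric $a_{\alpha\beta}$.

```python
import numpy as np

# Surface y = (u1, u2, u1*u2) evaluated at an arbitrary point (illustrative choice).
u1, u2 = 0.5, -1.2
J = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [u2, u1]])                   # entries dy^i/du^alpha

a = J.T @ J                                # metric a_{alpha beta}
sqrt_a = np.sqrt(np.linalg.det(a))
eps = sqrt_a * np.array([[0.0, 1.0],
                         [-1.0, 0.0]])     # epsilon_{alpha beta} = sqrt(a) e_{alpha beta}

def unit(v):
    """Scale a surface vector so that a_{alpha beta} v^alpha v^beta = 1."""
    v = np.asarray(v, dtype=float)
    return v / np.sqrt(v @ a @ v)

lam, mu = unit([1.0, 0.2]), unit([-0.3, 1.0])

cos_t = lam @ a @ mu                       # (54.4)
sin_t = lam @ eps @ mu                     # epsilon_{alpha beta} lam^alpha mu^beta

# Area of the parallelogram on the corresponding space vectors xi^i, eta^i.
xi, eta = J @ lam, J @ mu
area = np.linalg.norm(np.cross(xi, eta))
print(cos_t**2 + sin_t**2, abs(sin_t), area)
```

The two intrinsic quantities satisfy $\cos^2\theta + \sin^2\theta = 1$, and $|\sin\theta|$ matches the parallelogram area computed in the enveloping space, which is the geometric reading given in the text.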


Problems


1. Show that the cosine of the angle $\theta$ between the coordinate curves $u^1$ and $u^2$ on $S$ is $\cos\theta = a_{12}/\sqrt{a_{11}a_{22}}$.

2. Find the element of area of the surface of the sphere of radius $r$ if the equations of the surface are given in the form:

$y^1 = r\sin u^1\cos u^2, \qquad y^2 = r\sin u^1\sin u^2, \qquad y^3 = r\cos u^1,$

where the $y^i$ are orthogonal cartesian coordinates. (Note that in this case $a_{11} = r^2$, $a_{12} = 0$, $a_{22} = r^2\sin^2 u^1$.)


55. Fundamental Concepts of Calculus of Variations 


The most celebrated problem of intrinsic geometry of surfaces is 
concerned with the determination of curves of shortest length joining 
two specified points on the surface. This is the problem of geodesics. 
This problem has such profound implications on the formulation of the 
fundamental principles of optics, dynamics, and mechanics of deformable 
media that it is desirable to treat it in greater generality than one would 
if concerned solely with the geometry of surfaces imbedded in $E_3$. To
do this we shall draw on certain concepts in the calculus of variations. 
Since we will be concerned with the study of extremal properties of 
integrals, we shall recall some salient facts about the problem of relative 
maxima and minima of functions of several independent variables. 

Let $f(x^1, x^2, \ldots, x^n)$ be a continuous function of n independent variables
$x^i$ defined in a bounded, closed region R. We are interested in determining
a point P(x) of R at which f attains an extreme value in comparison with
the values of f in a certain neighborhood of the point P(x). There is no
doubt about the existence of a maximum or minimum of f since it is known
that every function continuous in a bounded closed region attains its maximum
and minimum values either in the interior or on the boundary of the region.$^{10}$
Moreover, if $f(x^1, x^2, \ldots, x^n)$
is a differentiable function, then at interior points of the region, where
the function attains its extreme values, $\partial f/\partial x^i = 0$, $(i = 1, 2, \ldots, n)$. The
vanishing of the derivatives of $f(x^1, x^2, \ldots, x^n)$ obviously is not a sufficient
condition for an extremum. We will call the points of the region R at
which the $\partial f/\partial x^i$ vanish simultaneously the stationary points of $f(x^1, \ldots, x^n)$.
The determination of stationary points is studied in advanced calculus, 
and we assume that this subject is quite familiar to the reader. 

Calculus of variations is also concerned with the determination of 
extreme or stationary values of certain expressions, but there is an 


10 This theorem is due to Weierstrass. 




important distinction in that in calculus of variations we deal with extremes 
of functionals rather than functions of a finite number of variables. By 
a functional we understand a function depending on the changes of one 
or several functions, which assume the roles of the arguments. As an 
example of a functional consider the formula 


    $s = \int_{x_1}^{x_2} \sqrt{1 + (dy/dx)^2}\; dx,$

defining the length of a plane curve y = y(x) joining the points whose
abscissas are $x_1$ and $x_2$. Here the value of s depends on the behavior of
the functional argument y(x), and the class of functions y(x) on which the 
functional s depends is in some measure arbitrary. Thus, one might 
consider the problem of determining the extremes of s when y(x) is an 
arbitrary continuous function with a piecewise continuous first derivative. 

In the study of extremes of continuous functions $f(x^1, \ldots, x^n)$ of a
finite number of independent variables $x^i$, we must specify the region
R within which f is defined, whereas in the study of extremes of functionals
we must characterize the class of admissible functional arguments. For 
example, we may demand that the functional arguments possess certain 
properties of continuity, or behave in some specified fashion at the end 
points of the interval, and so on. We will be concerned with relative 
extremes of functionals, that is, extremes relative to a certain “neighbor- 
hood” of functional arguments for which the functional takes on an 
extreme value, just as we were with relative maxima and minima of 
functions. In order to make the notion of the neighborhood of a function 
precise, we introduce a 

DEFINITION. A function $g(x^1, x^2, \ldots, x^n)$ belongs to the h-neighborhood
of the function $f(x^1, \ldots, x^n)$, provided that $|f - g| < h$, $h > 0$, for all
values of the independent variables $x^1, x^2, \ldots, x^n$ in the interior of R.

With the aid of this definition, we can formulate the fundamental 
problem of the calculus of variations as follows: Find, within the class 
of admissible functional arguments, those functions f that yield extreme 
values for the functional under consideration in comparison with the values 
given the functional by functions belonging to some h-neighborhood of f. 

A word concerning the difficulties inherent in this problem is in order. 
We have already remarked that in the theory of maxima and minima 
of continuous functions of several independent variables the existence of 
extreme values is guaranteed by the theorem of Weierstrass. In the 
problem of calculus of variations, on the other hand, it may happen that 
the problem is formulated without internal inconsistencies, and yet it 
has no solution because of the limitations imposed on the class of ad- 
missible functional arguments. For example, let it be required to join 




two given points on the X-axis by the shortest curve with continuous 
curvature so that the curve is orthogonal to the X-axis at the end points. 
This problem has no solution because the length of every admissible 
curve is always greater than the length of the straight line joining the
given points. We can always find a curve of admissible type whose length 
differs from the length of the straight line by as little as desired so that 
there exists a lower bound of the functional, but this lower bound is not 
the minimum attained for any curve of the class of curves under con- 
sideration. It follows from this example that in each variational problem 
we are confronted with the question of the existence of a solution of 
the problem. 

In order to deduce the differential equations furnishing a set of necessary 
conditions for an extremum of a functional, we need the following 
fundamental lemma of calculus of variations. 


If the integral $\int_{t_1}^{t_2} \xi(t) M(t)\, dt$, where M(t) is a continuous function of
t in the interval $t_1 \le t \le t_2$, vanishes for every choice of the function $\xi(t)$
of class $C^n$ in $t_1 \le t \le t_2$, and which is such that $\xi(t_1) = \xi(t_2) = 0$, then
M(t) is identically zero in the interval $t_1 \le t \le t_2$.

We shall prove the lemma by assuming that $M(t) \not\equiv 0$ and reaching a
contradiction. Assume $M(t) \ne 0$ at some point $t'$ of $t_1 < t < t_2$ and
suppose that $M(t') > 0$. Since M(t) is continuous, there exists a number
$\delta > 0$ such that $M(t) > 0$ in the interval $(t' - \delta, t' + \delta)$. Define a
function $\xi(t)$ as follows:

    $\xi(t) = 0$,  in $t_1 \le t \le \tau_1$,  where $\tau_1 = t' - \delta$,

    $\xi(t) = 0$,  in $\tau_2 \le t \le t_2$,  where $\tau_2 = t' + \delta$,

    $\xi(t) = (t - \tau_1)^{n+1} (\tau_2 - t)^{n+1}$,  in $\tau_1 \le t \le \tau_2$.

The function $\xi(t)$ is surely of class $C^n$ in $(t_1, t_2)$ and $\xi(t_1) = \xi(t_2) = 0$.
For this function, however,

    $\int_{t_1}^{t_2} \xi(t) M(t)\, dt = \int_{\tau_1}^{\tau_2} \xi(t) M(t)\, dt > 0,$

since the integrand is always positive in $\tau_1 < t < \tau_2$. Thus we reach a
contradiction, and hence our assumption that $M(t) \not\equiv 0$ is not tenable.
The case $M(t') < 0$ is handled in the same way with M replaced by $-M$.

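The construction used in the proof can be tried out numerically. The sketch below is illustrative only: the function $M(t) = \cos 3t$ on $0 \le t \le 2$, the point $t' = 0.2$, the half-width $\delta = 0.15$, and the order $n = 2$ are all assumed values, and the test function is written with the factor $(\tau_2 - t)^{n+1}$ so that it is positive inside $(\tau_1, \tau_2)$.

```python
import math

def make_xi(tau1, tau2, n):
    """Test function vanishing outside (tau1, tau2), positive inside, of class C^n."""
    def xi(t):
        if t <= tau1 or t >= tau2:
            return 0.0
        return (t - tau1) ** (n + 1) * (tau2 - t) ** (n + 1)
    return xi

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

t1, t2 = 0.0, 2.0

def M(t):
    """Assumed continuous function; it is positive for 0 <= t < pi/6."""
    return math.cos(3.0 * t)

t_prime, delta = 0.2, 0.15          # M > 0 on (t' - delta, t' + delta)
xi = make_xi(t_prime - delta, t_prime + delta, n=2)

value = integrate(lambda t: xi(t) * M(t), t1, t2)
print(value)                        # strictly positive, so the integral cannot vanish
```

Although the integral of M itself over the whole interval may have either sign, the localized test function picks out only the region where M is positive, which is the whole point of the argument.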
56. Euler’s Equation in the Simplest Case 


The simplest problem of the calculus of variations is concerned with 
the determination of extremals of a functional 


(56.1)    $J(x) = \int_{t_1}^{t_2} F(t, x, \dot{x})\, dt,$

where $F(t, x, \dot{x})$ is a prescribed real function of its real arguments t, x,
and $\dot{x} = dx/dt$. We shall suppose that $F(t, x, \dot{x})$ is of class $C^2$, in some
region R of the plane (x, t), for all values of $\dot{x}$. Concerning the class
of admissible functions x(t), we suppose that the values $x(t_1)$ and $x(t_2)$ are
prescribed in advance and that x(t) is also of class $C^2$ in $t_1 \le t \le t_2$.$^{11}$

Our problem is to find a function 


    $x = f(t), \quad t_1 \le t \le t_2,$

called an extremal for the integral 56.1, such that J(x) for x = f(t) assumes
an extreme value in comparison with the values given to J by the admissible
functions in a sufficiently small h-neighborhood of the function x = f(t).
In other words, admissible functions x(t) are such that $|x(t) - f(t)| < h$
for $t_1 \le t \le t_2$. We shall deduce next a necessary condition for an
extremum of J. Consider a function $\xi(t)$ of class $C^2$, such that $\xi(t_1) =
\xi(t_2) = 0$, and form a set of functions

    $\bar{x}(t) = x(t) + \epsilon\, \xi(t) = x + \delta x,$

where $\epsilon$ is an arbitrary numerical parameter near zero. The functions
$\bar{x}(t)$ clearly assume the same values at the end points of the interval
$(t_1, t_2)$ as x(t). We shall call the $\bar{x}(t)$ the varied functions, and the quantity
$\epsilon\, \xi(t) = \delta x$ the variation of the function x = f(t). For sufficiently small
values of $\epsilon$ all varied functions $\bar{x}(t)$ will be contained in the h-neighborhood
of the extremal x = f(t). Consequently, the integral 56.1,

    $J(\bar{x}) = J(x + \epsilon \xi) = \Phi(\epsilon),$

considered as a function of $\epsilon$, will have an extreme value for $\epsilon = 0$. A
necessary condition that this be so is $\Phi'(0) = 0$.

Because of the restrictions imposed on functions under consideration, 
the integral 


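    $\Phi(\epsilon) = \int_{t_1}^{t_2} F(t, x + \epsilon\xi, \dot{x} + \epsilon\dot{\xi})\, dt$

can be differentiated under the integral sign, and we obtain as a necessary
condition for an extremum the equation

(56.2)    $\Phi'(0) = \int_{t_1}^{t_2} (F_x\, \xi + F_{\dot{x}}\, \dot{\xi})\, dt = 0,$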


11 These restrictions are more severe than necessary, but we have in mind certain 
geometrical problems in which the continuity of second derivatives is a desirable 
property. 


which must be true for every $\xi(t)$ satisfying the conditions laid down in
the definition of $\xi(t)$. Integrating the second term of (56.2) by parts, we get

(56.3)    $\int_{t_1}^{t_2} F_x\, \xi(t)\, dt + \Big[ F_{\dot{x}}\, \xi(t) \Big]_{t_1}^{t_2} - \int_{t_1}^{t_2} \xi(t)\, \dfrac{dF_{\dot{x}}}{dt}\, dt = 0,$

and, since $\xi(t_1) = \xi(t_2) = 0$, the foregoing equation simplifies to

(56.4)    $\int_{t_1}^{t_2} \xi(t) \left[ F_x - \dfrac{dF_{\dot{x}}}{dt} \right] dt = 0.$

Since $\xi(t)$ satisfies the restrictions imposed on $\xi(t)$ in the lemma of Sec.
55, we deduce from (56.4) that a necessary condition for an extremum
of (56.1) is that x(t) satisfy the differential equation

(56.5)    $F_x - \dfrac{d}{dt} F_{\dot{x}} = 0.$

Expanding (56.5) we obtain

(56.6)    $F_{\dot{x}\dot{x}}\, \dfrac{d^2 x}{dt^2} + F_{\dot{x}x}\, \dfrac{dx}{dt} + F_{\dot{x}t} - F_x = 0,$

where the subscripts denote the derivatives of $F(t, x, \dot{x})$ with t, x, and $\dot{x}$
regarded as the independent variables. In order to determine x(t) we
must solve this ordinary differential equation subject to the end conditions
$x(t_1) = x_1$ and $x(t_2) = x_2$. Equations 56.5 and 56.6 were first deduced
by Euler and are called Euler’s equations.

The expression (see equation 56.2) 


    $\epsilon\, \Phi'(0) = \epsilon \int_{t_1}^{t_2} [\xi(t)\, F_x + \dot{\xi}(t)\, F_{\dot{x}}]\, dt,$

which is akin to the differential of the function $\Phi(\epsilon)$ evaluated at $\epsilon = 0$,
is called the first variation of the integral J, and is denoted by the symbol
$\delta J$. Thus

    $\delta J = \epsilon\, \Phi'(0).$

Taking into account the left-hand member of equation 56.3 and the
definition of $\delta J$, we can write

(56.7)    $\delta J = \Big[ F_{\dot{x}}\, \delta x \Big]_{t_1}^{t_2} + \int_{t_1}^{t_2} \left( F_x - \dfrac{d}{dt} F_{\dot{x}} \right) \delta x\, dt,$

where $\delta x = \epsilon\, \xi(t)$. Since the right-hand member of (56.7) vanishes when
x(t) is an extremal, we can state a

THEOREM. A necessary condition for an extremum of the functional J(x)
is the vanishing of its first variation.
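The vanishing of the first variation can be observed numerically in a simple case. The sketch below rests on assumed choices that are not part of the text: the integrand is $F = \sqrt{1 + \dot{x}^2}$ (arc length in the plane), the extremal is the straight line $x(t) = t$ on $0 \le t \le 1$, and the variation is $\xi(t) = \sin \pi t$, which vanishes at both end points.

```python
import math

def integrate(g, a, b, steps=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def phi(eps):
    """Phi(eps) = J(x + eps*xi) for F = sqrt(1 + xdot**2), x = t, xi = sin(pi t)."""
    def integrand(t):
        # derivative of the varied function t + eps * sin(pi t)
        xdot = 1.0 + eps * math.pi * math.cos(math.pi * t)
        return math.sqrt(1.0 + xdot * xdot)
    return integrate(integrand, 0.0, 1.0)

h = 1e-4
phi_prime_0 = (phi(h) - phi(-h)) / (2 * h)   # central-difference estimate of Phi'(0)

print(phi(0.0))               # length of the straight line, sqrt(2)
print(phi_prime_0)            # close to zero
print(phi(0.1), phi(-0.1))    # both exceed phi(0.0)
```

Here the stationary value is in fact a minimum, since every varied curve is longer than the straight line; the condition $\Phi'(0) = 0$ alone would not tell us this.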




57. Euler’s Equations for a Functional of Several Arguments 


Consider next the case of a functional J depending on several functional 
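arguments $x^i$, $(i = 1, 2, \ldots, n)$, where$^{12}$

(57.1)    $J = \int_{t_1}^{t_2} F(t, x^1, \ldots, x^n, \dot{x}^1, \ldots, \dot{x}^n)\, dt.$

As in Sec. 56 we assume that F is a real function of class $C^2$ in the $(2n + 1)$-dimensional
space of the real variables $t, x^1, \ldots, x^n, \dot{x}^1, \ldots, \dot{x}^n$.

We suppose that there exists a set of functions

(57.2)    $x^i = x^i(t), \quad t_1 \le t \le t_2, \quad (i = 1, \ldots, n),$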


whose values at the end points of the interval are known, and which are 
such that (57.1) assumes an extreme value in comparison with the values 
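given to J by a class of admissible functions belonging to the h-neighborhood
of (57.2). We introduce n arbitrary functions $\xi^i = \xi^i(t)$, $t_1 \le t \le t_2$,
of class $C^2$ which vanish for $t = t_1$ and $t = t_2$ and construct a family
of admissible functions

(57.3)    $\bar{x}^i = x^i(t) + \epsilon\, \xi^i(t),$

where the parameter $\epsilon$ is so chosen as to make the varied paths 57.3 lie
in the h-neighborhood of the curve 57.2.
As in Sec. 56 we form the function

(57.4)    $\Phi(\epsilon) = \int_{t_1}^{t_2} F(t, x^1 + \epsilon\xi^1, \ldots, x^n + \epsilon\xi^n, \dot{x}^1 + \epsilon\dot{\xi}^1, \ldots, \dot{x}^n + \epsilon\dot{\xi}^n)\, dt,$

which, by hypothesis, has an extremum for $\epsilon = 0$; hence,

    $\left. \dfrac{d\Phi}{d\epsilon} \right|_{\epsilon = 0} = 0.$

It follows that

(57.5)    $\delta J = \epsilon \int_{t_1}^{t_2} [(F_{x^1} \xi^1 + F_{\dot{x}^1} \dot{\xi}^1) + \cdots + (F_{x^n} \xi^n + F_{\dot{x}^n} \dot{\xi}^n)]\, dt = 0,$

and the integration by parts gives

    $\delta J = \epsilon \Big[ F_{\dot{x}^1} \xi^1 + \cdots + F_{\dot{x}^n} \xi^n \Big]_{t_1}^{t_2} + \epsilon \int_{t_1}^{t_2} \left[ \xi^1 \left( F_{x^1} - \dfrac{d}{dt} F_{\dot{x}^1} \right) + \cdots + \xi^n \left( F_{x^n} - \dfrac{d}{dt} F_{\dot{x}^n} \right) \right] dt = 0.$

12 To ensure the independence of the integral 57.1 of special modes of parametrization,
we suppose that $F(t, x, \dot{x})$ is positively homogeneous of degree one in the $\dot{x}^i$ (see
Sec. 43).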


Since the $\xi^i$ are arbitrary and vanish at the end points of the interval
(in particular, all but one of them may be taken identically zero),
we conclude from the fundamental lemma that

(57.7)    $F_{x^i} - \dfrac{d}{dt} F_{\dot{x}^i} = 0, \quad (i = 1, \ldots, n),$

or

    $F_{x^i} - \ddot{x}^j F_{\dot{x}^i \dot{x}^j} - \dot{x}^j F_{\dot{x}^i x^j} - F_{\dot{x}^i t} = 0.$

This set of n ordinary differential equations of second order is called
the Euler equations for the variational problem associated with the
functional 57.1. Thus, to obtain the set of functions 57.2, we must
determine the solution of the system 57.7 satisfying the end conditions

(57.8)    $x^i(t_1) = x_1^i, \quad x^i(t_2) = x_2^i, \quad (i = 1, 2, \ldots, n).$


The problem discussed in this section appears to be entirely analogous 
to the simpler one treated in Sec. 56, but there is a distinction in that the 
vanishing of the first variation of (57.1) is a necessary condition not 
only for an extremum but also for a mixed maximum and minimum, the 
so-called minimax. An integral $J(x^1, \ldots, x^n)$ may attain a maximum
when the function $x^1(t)$ is varied and a minimum in the course of the
variation of $x^2(t)$. The saddle point of a hyperbolic paraboloid, studied
in the elementary theory of maxima and minima, is a simple illustration 
of this circumstance. We will call the solutions of Euler’s equations 57.7 
satisfying the end conditions 57.8 the extremals of the functional J. This 
term will be used regardless of the nature of the stationary value assumed 
by the functional J, be it a maximum, minimum, or neither. 

In our derivation of Euler’s equations 57.7, we assumed that the variables
$x^i$ are independent. When the $x^i$ are constrained by a set of k < n
functional relations of the form

    $\phi_j(t, x^1, x^2, \ldots, x^n) = 0, \quad (j = 1, 2, \ldots, k),$


the set of appropriate Euler’s equations can be deduced by considering 
the free extremum of a certain new functional introduced by Lagrange. 

To clarify the essential differences in the problems of free and constrained 
extrema, consider the functional 


(57.9)    $J = \int_{t_1}^{t_2} F(t, x^1, x^2, \dot{x}^1, \dot{x}^2)\, dt,$

in which the variables are constrained by the relation

(57.10)    $\phi(t, x^1, x^2) = 0.$

We suppose that the extremal,

    $x^i = x^i(t), \quad t_1 \le t \le t_2, \quad (i = 1, 2),$

satisfies the end conditions of the type 57.8. When the constraining
condition 57.10 is written in the form

(57.11)    $\phi(x, y, z) = 0$

by setting $x = t$, $y = x^1$, $z = x^2$, equation 57.11 can be thought to represent
a surface referred to a set of cartesian xyz-axes. The extremal must


lie on this surface and we suppose that (57.11) can be solved for z in the 
neighborhood of the extremal to yield a differentiable function 


(57.12) z = f(x, y). 
The substitution from (57.12) in (57.9) then yields an integral of the form 


(57.13)    $J = \int_{x_1}^{x_2} \mathcal{F}(x, y, y')\, dx,$

in which the variables x and y are independent, and we can obtain the
Euler equation by minimizing (57.13) on a set of admissible paths that
satisfy the end conditions $y(x_1) = y_1$, $y(x_2) = y_2$. This is the problem of
the free extremum already considered in Sec. 56. 

However, such reduction of the problem of constrained extremum to 
the problem of a free extremum of the functional 57.13 is usually incon- 
venient because an explicit solution 57.12 of equation 57.11 may prove 
unwieldy. In this event we can follow a procedure similar to that of the 
Lagrange multiplier method of obtaining the relative extreme values of 
functions of several variables constrained by relations of the type 57.11. 

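We suppose that $\partial\phi/\partial z \ne 0$, so that it is theoretically possible to obtain
the solution (57.12) of (57.11). If this is so, (57.9) can be rewritten in the form

(57.14)    $J = \int_{x_1}^{x_2} F[x, y, f(x, y), y', f_x + f_y y']\, dx,$

since $z' = f_x + f_y y'$.
The integrand in (57.14) is a function of x, y, and y', which we denote by

(57.15)    $\mathcal{F}(x, y, y') \equiv F(x, y, f, y', f_x + f_y y').$

Thus the Euler equation associated with the integral 57.14 is

(57.16)    $\dfrac{\partial \mathcal{F}}{\partial y} - \dfrac{d}{dx} \dfrac{\partial \mathcal{F}}{\partial y'} = 0.$

On noting (57.15), we see that

    $\dfrac{\partial \mathcal{F}}{\partial y} = F_y + F_z f_y + F_{z'} (f_{xy} + f_{yy} y'),$

    $\dfrac{\partial \mathcal{F}}{\partial y'} = F_{y'} + F_{z'} f_y,$

so that (57.16) yields

    $F_y + F_z f_y + F_{z'} (f_{xy} + f_{yy} y') - \dfrac{dF_{y'}}{dx} - f_y \dfrac{dF_{z'}}{dx} - F_{z'} (f_{xy} + f_{yy} y') = 0,$

or

    $\left( F_y - \dfrac{dF_{y'}}{dx} \right) + f_y \left( F_z - \dfrac{dF_{z'}}{dx} \right) = 0.$

Thus

(57.17)    $f_y = - \dfrac{F_y - \dfrac{dF_{y'}}{dx}}{F_z - \dfrac{dF_{z'}}{dx}},$

since $f_y$ is assumed to be defined along the extremal.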
On the other hand, the differentiation of (57.11) with respect to y, with z = f(x, y), yields

    $\phi_y + \phi_z f_y = 0,$

so that

(57.18)    $f_y = - \dfrac{\phi_y}{\phi_z}.$

The expressions for $f_y$, given by (57.18) and (57.17) along the extremal,
represent the same function of x; hence

(57.19)    $\dfrac{\dfrac{dF_{y'}}{dx} - F_y}{\phi_y} = \dfrac{\dfrac{dF_{z'}}{dx} - F_z}{\phi_z} = \lambda(x),$

where $\lambda(x)$ denotes the common value of the ratios.$^{13}$
It follows from (57.19) that the necessary conditions for the extremum
of the integral 57.9 are

(57.20)    $\dfrac{dF_{y'}}{dx} - [F_y + \lambda(x) \phi_y] = 0,$

           $\dfrac{dF_{z'}}{dx} - [F_z + \lambda(x) \phi_z] = 0.$

When we revert to the original notation by setting $x = t$, $y = x^1$,
$z = x^2$, we obtain a pair of equations

(57.21)    $\dfrac{d}{dt} F_{\dot{x}^i} - [F_{x^i} + \lambda(t) \phi_{x^i}] = 0, \quad (i = 1, 2),$

which have the structure of Euler’s equations 57.7 for the variational
problem associated with the free extremum of the integral

    $\int_{t_1}^{t_2} [F + \lambda(t)\, \phi(t, x^1, x^2)]\, dt.$

13 We note that if both $\phi_y = 0$ and $\phi_z = 0$, equation 57.11 does not define a surface.


Similar considerations apply to the problem of minimizing the integral
57.1 in which the n variables $x^i$ are constrained by a set of k < n relations

(57.22)    $\phi_j(t, x^1, \ldots, x^n) = 0, \quad (j = 1, \ldots, k).$

If the matrix $(\partial\phi_j/\partial x^i)$ is of rank k, the system of differential equations
for the extremal is

(57.23)    $G_{x^i} - \dfrac{d}{dt} G_{\dot{x}^i} = 0,$

where

    $G = F + \lambda_j(t)\, \phi_j(t, x), \quad (j = 1, \ldots, k).$

We note in conclusion that the constraining relations 57.22 do not
involve the derivatives $\dot{x}^i$. Such constraints are called holonomic to
distinguish them from constraints of the form

(57.24)    $\phi(t, x, \dot{x}) = 0,$

which are nonholonomic. Nonholonomic constraints arise in the study
of dissipative dynamical systems, and we shall encounter them in Chapter
4. It is clear from the foregoing discussion that equations corresponding
to (57.23), when nonholonomic constraints are present, must involve
not only the multipliers $\lambda_j(t)$ but also the derivatives $\dot{\lambda}_j(t)$.


Problems 


1. Consider the variational problem (56.1) with Euler’s equation 56.6.
Note that if $F(t, x, \dot{x})$ does not contain x explicitly then (56.5) yields at once the
first integral in the form $F_{\dot{x}}$ = constant. Show that when $F(t, x, \dot{x})$ does not
contain t explicitly then the first integral of (56.6) is $F - \dot{x} F_{\dot{x}}$ = constant.

Hint. Let $F = F(x, \dot{x})$, compute $\dfrac{d}{dt} (F - \dot{x} F_{\dot{x}})$ and make use of (56.6).

2. Note the hint in Problem 1 and show that the Euler equation 56.5 can be
written in the form

    $\dfrac{d}{dt} (F - \dot{x} F_{\dot{x}}) - F_t = 0.$

3. The Euler equation 57.16 is an identity whenever $\mathcal{F}(x, y, y') = M(x, y) +
N(x, y)\, y'$, with $M_y = N_x$. In this case (57.13) becomes

    $J = \int_{x_1}^{x_2} M\, dx + N\, dy$

and this integral is independent of the path. Thus, every curve joining the given
endpoints is an extremal for (57.13).


58. Geodesics in $R_n$


We are now in a position to discuss the problem of finding curves of 
minimum length joining a pair of given points on the surface. We will 
carry out our calculation for the case of the n-dimensional Riemannian 
manifolds, since our results will be of interest not only in connection 
with the geometry of surfaces but also in the study of dynamical trajec- 
tories in Chapter 4. 

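Let metric properties of the n-dimensional manifold $R_n$ be determined
by

(58.1)    $ds^2 = g_{ij}\, dx^i\, dx^j, \quad (i, j = 1, \ldots, n),$

where $g_{ij} = g_{ji}$ are specified functions of the variables $x^i$. We suppose
that the form 58.1 is positive definite and the functions $g_{ij}$ are of class
$C^2$. The length of a curve C, represented in $R_n$ by equations

    $C: \quad x^i = x^i(t), \quad t_1 \le t \le t_2,$

is given by

(58.2)    $s = \int_{t_1}^{t_2} \sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta}\; dt, \quad (\alpha, \beta = 1, \ldots, n).$

The extremals of the functional 58.2 will be termed geodesics in $R_n$.
The function F of Sec. 57 in this case is

(58.3)    $F = \sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta},$

and, to form Euler’s equations 57.7, we need to compute $F_{x^j}$ and $F_{\dot{x}^j}$. This
computation is straightforward. We deduce from (58.3) that

    $F_{x^j} = \dfrac{1}{2} (g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta)^{-1/2}\, \dfrac{\partial g_{\alpha\beta}}{\partial x^j}\, \dot{x}^\alpha \dot{x}^\beta$

and

    $F_{\dot{x}^j} = (g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta)^{-1/2}\, g_{\alpha j}\, \dot{x}^\alpha.$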


Substituting these expressions in Euler’s equations yields 


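(58.4)    $\dfrac{d}{dt} \left( \dfrac{g_{\alpha j}\, \dot{x}^\alpha}{\sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta}} \right) - \dfrac{\dfrac{\partial g_{\alpha\beta}}{\partial x^j}\, \dot{x}^\alpha \dot{x}^\beta}{2 \sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta}} = 0.$

Since $ds/dt = \sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta}$, equation 58.4 can be written in the form

    $\dfrac{d}{dt} \left( \dfrac{g_{\alpha j}\, \dot{x}^\alpha}{ds/dt} \right) - \dfrac{\dfrac{\partial g_{\alpha\beta}}{\partial x^j}\, \dot{x}^\alpha \dot{x}^\beta}{2\, ds/dt} = 0,$

and, carrying out the indicated differentiation, we obtain

    $g_{\alpha j}\, \ddot{x}^\alpha + \dfrac{\partial g_{\alpha j}}{\partial x^\beta}\, \dot{x}^\alpha \dot{x}^\beta - \dfrac{1}{2} \dfrac{\partial g_{\alpha\beta}}{\partial x^j}\, \dot{x}^\alpha \dot{x}^\beta = g_{\alpha j}\, \dot{x}^\alpha\, \dfrac{d^2 s/dt^2}{ds/dt}.$

Since the second term in this equation can be written as a sum of two
terms, we have

    $g_{\alpha j}\, \ddot{x}^\alpha + \dfrac{1}{2} \left( \dfrac{\partial g_{\alpha j}}{\partial x^\beta} + \dfrac{\partial g_{\beta j}}{\partial x^\alpha} - \dfrac{\partial g_{\alpha\beta}}{\partial x^j} \right) \dot{x}^\alpha \dot{x}^\beta = g_{\alpha j}\, \dot{x}^\alpha\, \dfrac{d^2 s/dt^2}{ds/dt}.$

But $[\alpha\beta, j] = \dfrac{1}{2} \left( \dfrac{\partial g_{\alpha j}}{\partial x^\beta} + \dfrac{\partial g_{\beta j}}{\partial x^\alpha} - \dfrac{\partial g_{\alpha\beta}}{\partial x^j} \right)$, so that the foregoing equation
assumes the form

(58.5)    $g_{\alpha j}\, \ddot{x}^\alpha + [\alpha\beta, j]\, \dot{x}^\alpha \dot{x}^\beta = g_{\alpha j}\, \dot{x}^\alpha\, \dfrac{d^2 s/dt^2}{ds/dt}.$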
These are the desired equations of geodesics. If we choose the parameter 

t to be the arc length s of the curve, that is, if we set 


    $\dfrac{ds}{dt} = \sqrt{g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta} = 1,$

the system 58.5 simplifies to read

(58.6)    $g_{\alpha j}\, \ddot{x}^\alpha + [\alpha\beta, j]\, \dot{x}^\alpha \dot{x}^\beta = 0.$

In equation 58.6, dots denote the differentiation with respect to the arc
parameter s.

If we multiply equation 58.6 by the tensor $g^{ij}$ and sum, we obtain a
simple form of the equations of geodesics in $R_n$,

(58.7)    $\dfrac{d^2 x^i}{ds^2} + \begin{Bmatrix} i \\ \alpha\beta \end{Bmatrix} \dfrac{dx^\alpha}{ds} \dfrac{dx^\beta}{ds} = 0, \quad (i, \alpha, \beta = 1, \ldots, n).$

We observe that the form of these equations is identical with equations
51.2 defining the straight line in $E_3$. Since (58.7) is a system of ordinary
second-order differential equations, it possesses a unique solution when the values
$x^i(s)$ and the first derivatives $dx^i/ds$ are prescribed arbitrarily at a given
point $x^i(s_0)$.

If we regard a given surface S as a Riemannian two-dimensional 
manifold $R_2$, covered by Gaussian coordinates $u^\alpha$, then (58.7) assumes
the form

(58.8)    $\dfrac{d^2 u^\gamma}{ds^2} + \begin{Bmatrix} \gamma \\ \alpha\beta \end{Bmatrix} \dfrac{du^\alpha}{ds} \dfrac{du^\beta}{ds} = 0, \quad (\alpha, \beta, \gamma = 1, 2).$

Hence at each point of S there exists a unique geodesic with an arbitrarily
prescribed direction $\lambda^\alpha = du^\alpha/ds$. It is not difficult to prove that, if there
exists a unique solution $u^\alpha(s)$, passing through two given points on S,
then the curve $u^\alpha(s)$ is the curve of shortest length joining these points.$^{14}$

Fig. 25

If the manifold $R_n$ is Euclidean, a coordinate system exists in which
the Christoffel symbols vanish. In this case equations 58.7 become
$d^2 x^i/ds^2 = 0$. The general solution of this equation is $x^i = A^i s + B^i$.
Thus the geodesics in $E_n$ are straight lines.

As another illustration, consider the problem of determining geodesics
on an arbitrary cylinder immersed in $E_3$. We choose the $Y^3$-axis parallel
to the generators of the cylinder and let the trace C of the cylinder on the
$Y^1 Y^2$-plane be given by equations

    $y^1 = \phi(\sigma),$
    $y^2 = \psi(\sigma),$

where $\sigma$ is the arc length of C. (See Fig. 25.) Since

    $(d\sigma)^2 = (dy^1)^2 + (dy^2)^2,$

an element of arc ds of the geodesic is given by

    $(ds)^2 = (d\sigma)^2 + (dy^3)^2,$

so that, with $u^1 = \sigma$ and $u^2 = y^3$, we have $a_{11} = a_{22} = 1$, $a_{12} = a_{21} = 0$.
The Christoffel symbols therefore vanish, and equations 58.8 reduce to

    $\dfrac{d^2 \sigma}{ds^2} = 0, \quad$ for $\gamma = 1$,

and

    $\dfrac{d^2 y^3}{ds^2} = 0, \quad$ for $\gamma = 2$.

14 See, for example, L. P. Eisenhart, Differential Geometry (1940), p. 175. 




We thus obtain 
    $\sigma = A_1 s + B_1,$
    $y^3 = A_2 s + B_2.$

If $A_1 \ne 0$, we can write these equations in the form

    $y^3 = C_1 \sigma + C_2,$

where $C_1$ and $C_2$ are arbitrary constants.
The equations of the geodesics are therefore

    $y^1 = \phi(\sigma),$
    $y^2 = \psi(\sigma),$
    $y^3 = C_1 \sigma + C_2,$

and hence the curve is a helix, whose pitch is determined by $C_1$. The
constant $C_2$ determines the origin for the arc parameter $\sigma$.
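The helix can be compared with its image on the developed cylinder. The sketch below is illustrative and assumes a circular cylinder of radius R, so that $\phi(\sigma) = R\cos(\sigma/R)$ and $\psi(\sigma) = R\sin(\sigma/R)$; the constants R, $C_1$, $C_2$ and the interval of $\sigma$ are arbitrary choices. Since $(ds)^2 = (d\sigma)^2 + (dy^3)^2$, the length of the helix should equal that of the straight segment $y^3 = C_1\sigma + C_2$ in the $(\sigma, y^3)$-plane.

```python
import math

R, C1, C2 = 1.5, 0.4, 0.2   # assumed radius and helix constants

def helix_point(sigma):
    """Point of the geodesic y^3 = C1*sigma + C2 on a circular cylinder of radius R."""
    return (R * math.cos(sigma / R), R * math.sin(sigma / R), C1 * sigma + C2)

def polyline_length(curve, a, b, steps=20000):
    """Approximate the arc length in E_3 by summing short chords."""
    total = 0.0
    prev = curve(a)
    for k in range(1, steps + 1):
        cur = curve(a + (b - a) * k / steps)
        total += math.dist(prev, cur)
        prev = cur
    return total

sigma_a, sigma_b = 0.0, 3.0
length_in_space = polyline_length(helix_point, sigma_a, sigma_b)
# Straight segment in the developed (sigma, y^3)-plane.
length_developed = math.sqrt(1.0 + C1 ** 2) * (sigma_b - sigma_a)
print(length_in_space, length_developed)
```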

We can show in a similar way that the geodesics on the surface of a 
sphere are arcs of great circles. (See Problem 1.) However, as an illustra- 
tion of the use of equations 57.23 we consider the functional 


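(58.9)    $s = \int_{t_1}^{t_2} \sqrt{\dot{x}^i \dot{x}^i}\; dt, \quad (i = 1, 2, 3),$

representing the length of arc of a curve C in cartesian coordinates. If
C is to lie on a sphere of radius a, the constraining relation 57.10 has the
form

(58.10)    $\phi \equiv x^i x^i - a^2 = 0,$

when the center of the sphere is at the origin. The function $G = F + \lambda\phi$,
introduced in Sec. 57, is

(58.11)    $G = \sqrt{\dot{x}^i \dot{x}^i} + \lambda(t) (x^i x^i - a^2),$

and equations 57.23 take the form

(58.12)    $\dfrac{d}{dt} \left( \dfrac{dx^i}{ds} \right) - 2 \lambda(t)\, x^i = 0, \quad (i = 1, 2, 3),$

where $ds = \sqrt{\dot{x}^i \dot{x}^i}\; dt$.
On eliminating $\lambda(t)$ from the first two equations in the set 58.12, we get

    $x^2 \dfrac{d}{dt} \left( \dfrac{dx^1}{ds} \right) - x^1 \dfrac{d}{dt} \left( \dfrac{dx^2}{ds} \right) = 0,$

or

    $\dfrac{d}{dt} \left( x^2 \dfrac{dx^1}{ds} - x^1 \dfrac{dx^2}{ds} \right) = 0.$

Therefore

(58.13)    $x^2 \dfrac{dx^1}{ds} - x^1 \dfrac{dx^2}{ds} = C_1,$

where $C_1$ is a constant of integration.

Similarly, the elimination of $\lambda(t)$ from the last two equations in (58.12)
yields on integration

(58.14)    $x^2 \dfrac{dx^3}{ds} - x^3 \dfrac{dx^2}{ds} = C_2,$

where $C_2$ is a constant.
From (58.13) and (58.14) we find

    $\dfrac{x^2\, dx^1 - x^1\, dx^2}{C_1} = \dfrac{x^2\, dx^3 - x^3\, dx^2}{C_2},$

or

(58.15)    $C_2\, d\!\left( \dfrac{x^1}{x^2} \right) = C_1\, d\!\left( \dfrac{x^3}{x^2} \right),$

and the integration of equation 58.15 gives

(58.16)    $C_2 x^1 + C_3 x^2 - C_1 x^3 = 0,$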

which represents a plane through the origin. Equation 58.16 together 
with the constraining relation 58.10 shows that the solution of the system 
58.12 is an arc of a great circle. 
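The same conclusion can be observed by integrating the intrinsic equations 58.8 numerically. The sketch that follows is only an illustration under stated assumptions: the sphere has unit radius and is referred to the latitude and longitude coordinates of Problem 1 below, for which $ds^2 = (du^1)^2 + \cos^2 u^1\, (du^2)^2$ and the nonvanishing Christoffel symbols are $\sin u^1 \cos u^1$ (upper index 1, lower indices 22) and $-\tan u^1$ (upper index 2, lower indices 12 and 21). A classical fourth-order Runge-Kutta step is used, and the initial point and direction are arbitrary choices.

```python
import math

def rhs(state):
    """Geodesic equations on the unit sphere; state = (u1, u2, du1/ds, du2/ds)."""
    u1, u2, p1, p2 = state
    return (p1, p2,
            -math.sin(u1) * math.cos(u1) * p2 * p2,
            2.0 * math.tan(u1) * p1 * p2)

def rk4_step(state, h):
    def shift(s, k, c):
        return tuple(a + c * b for a, b in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, h / 2))
    k3 = rhs(shift(state, k2, h / 2))
    k4 = rhs(shift(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def point(u1, u2):
    """Cartesian coordinates of the point (u1, u2) of the unit sphere."""
    return (math.cos(u1) * math.cos(u2), math.cos(u1) * math.sin(u2), math.sin(u1))

def velocity(u1, u2, p1, p2):
    """Cartesian components of the tangent vector dy/ds."""
    return (-math.sin(u1) * math.cos(u2) * p1 - math.cos(u1) * math.sin(u2) * p2,
            -math.sin(u1) * math.sin(u2) * p1 + math.cos(u1) * math.cos(u2) * p2,
            math.cos(u1) * p1)

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

# Unit-speed initial data: (0.6)**2 + cos(u1)**2 * (0.8 / cos(u1))**2 = 1.
state = (0.3, 0.0, 0.6, 0.8 / math.cos(0.3))
normal = cross(point(state[0], state[1]), velocity(*state))  # plane through the origin

max_off_plane = 0.0
for _ in range(2000):                      # integrate from s = 0 to s = 2
    state = rk4_step(state, 1e-3)
    max_off_plane = max(max_off_plane,
                        abs(dot(point(state[0], state[1]), normal)))

speed_sq = state[2] ** 2 + math.cos(state[0]) ** 2 * state[3] ** 2
print(max_off_plane, speed_sq)
```

The computed curve stays in the plane through the origin determined by the initial point and the initial tangent, and the parameter remains the arc length, in agreement with (58.16) and (58.10).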


Problems 


1. In an orthogonal cartesian frame Y, the sphere of radius a is determined
by equations

    $y^1 = a \cos u^1 \cos u^2,$
    $y^2 = a \cos u^1 \sin u^2,$
    $y^3 = a \sin u^1.$

In this case $ds^2 = a^2 (du^1)^2 + a^2 (\cos u^1)^2 (du^2)^2$ and

    $s = a \int_{u_0^1}^{u_1^1} \sqrt{1 + \cos^2 u^1\, (\dot{u}^2)^2}\; du^1,$

where $\dot{u}^2 = du^2/du^1$. Show that the geodesics are great circles.
2. Find the geodesics on the surface 


yt = ut cosu, y? = t sin 4’, g> =ü 


imbedded in $E_3$. The coordinates $y^i$ are orthogonal cartesian.




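3. Show that, if we set $Q = a_{\alpha\beta}\, \dot{u}^\alpha \dot{u}^\beta$, where $\dot{u}^\alpha = du^\alpha/ds$, the equations of the
geodesics 58.8 in $R_2$ can be written

    $\dfrac{d}{ds} \dfrac{\partial Q}{\partial \dot{u}^\gamma} - \dfrac{\partial Q}{\partial u^\gamma} = 0.$

Hence the solutions of these equations for $\ddot{u}^\gamma$ should yield $- \begin{Bmatrix} \gamma \\ \alpha\beta \end{Bmatrix} \dot{u}^\alpha \dot{u}^\beta$, as can be
seen from (58.8). This suggests a different means for computing the symbols
$\begin{Bmatrix} \gamma \\ \alpha\beta \end{Bmatrix}$ in any particular coordinate system. Use this method to calculate the
Christoffel symbols for the coordinate system in Problem 1 by determining the
coefficients of $\dot{u}^\alpha \dot{u}^\beta$ in the solutions for the second derivatives of $u^\gamma$ with respect
to s.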


59. Geodesic Coordinates 


We have seen (Sec. 39) that, if a Riemannian space $R_n$ is Euclidean,
a coordinate system exists in which the components $g_{ij}$ of the metric
tensor are constants throughout the space. This implies that in such a
coordinate system $\partial g_{ij}/\partial x^k = 0$. The vanishing of these partial derivatives
is equivalent to the vanishing of all Christoffel symbols, since$^{15}$

    $[ij, k] = \dfrac{1}{2} \left( \dfrac{\partial g_{ik}}{\partial x^j} + \dfrac{\partial g_{jk}}{\partial x^i} - \dfrac{\partial g_{ij}}{\partial x^k} \right).$

If $R_n$ is not Euclidean, then the Christoffel
symbols do not vanish at all points of $R_n$, but it is possible to find a
coordinate system, in fact infinitely many, in which they vanish at any
given point P of $R_n$. Such coordinates are called geodesic for that particular
point, or locally cartesian at P.

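Thus consider some surface net with coordinates $u^\alpha$ and consider the
point $P(u_P^1, u_P^2)$ on S. If $v^\alpha$ are the coordinates of some other net on S,
then

(59.1)    $u^\alpha = u^\alpha(v^1, v^2), \quad (\alpha = 1, 2).$

The second derivative formula 32.5 yields the relation

(59.2)    $\dfrac{\partial^2 u^\alpha}{\partial v^\lambda\, \partial v^\mu} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix}_u \dfrac{\partial u^\beta}{\partial v^\lambda} \dfrac{\partial u^\gamma}{\partial v^\mu} = \begin{Bmatrix} \nu \\ \lambda\mu \end{Bmatrix}_v \dfrac{\partial u^\alpha}{\partial v^\nu},$

the subscripts u and v indicating the coordinate system in which the
Christoffel symbols are formed.
However, if there exists a transformation of coordinates 59.1 such that the
Christoffel symbols $\begin{Bmatrix} \nu \\ \lambda\mu \end{Bmatrix}_v$ vanish at P, then for that particular point

(59.3)    $\dfrac{\partial^2 u^\alpha}{\partial v^\lambda\, \partial v^\mu} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix}_u \dfrac{\partial u^\beta}{\partial v^\lambda} \dfrac{\partial u^\gamma}{\partial v^\mu} = 0.$

15 See also Theorem I, Sec. 39.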


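We exhibit next a solution of this equation yielding a particular transformation
59.1 to a coordinate system $v^\alpha$ in which the Christoffel symbols
vanish at P. It is the second-degree polynomial

(59.4)    $u^\alpha = u_P^\alpha + v^\alpha - \dfrac{1}{2} \begin{Bmatrix} \alpha \\ \lambda\mu \end{Bmatrix}_P v^\lambda v^\mu,$

where $u_P^\alpha$ is the value of $u^\alpha$ at P and the $\begin{Bmatrix} \alpha \\ \lambda\mu \end{Bmatrix}_P$ are the values of the
Christoffel symbols at P. To verify that (59.4) satisfies (59.3), we compute

(59.5)    $\dfrac{\partial u^\alpha}{\partial v^\lambda} = \delta^\alpha_\lambda - \begin{Bmatrix} \alpha \\ \lambda\mu \end{Bmatrix}_P v^\mu$

and

(59.6)    $\dfrac{\partial^2 u^\alpha}{\partial v^\lambda\, \partial v^\mu} = - \begin{Bmatrix} \alpha \\ \lambda\mu \end{Bmatrix}_P.$

From (59.4) we see that the point P, in new coordinates, is given by
$v^\alpha = 0$, and hence at the point P, equation 59.5 yields $\partial u^\alpha/\partial v^\lambda = \delta^\alpha_\lambda$.
Inserting values from this equation and (59.6) in (59.3), we see that it is
satisfied at P. Hence the new variables indeed are geodesic coordinates
at P.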
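The effect of the transformation 59.4 can be checked numerically. The following sketch makes several assumptions beyond the text: the surface is the unit sphere with the metric $a_{11} = 1$, $a_{12} = 0$, $a_{22} = \cos^2 u^1$ of Problem 1 of Sec. 58, the point P is chosen arbitrarily, and the vanishing of the Christoffel symbols at P is tested through the equivalent condition that all first derivatives of the transformed metric coefficients vanish there (estimated by central differences).

```python
import math

U_P = (0.5, 1.0)   # assumed point P in the (u1, u2) coordinates

def metric_u(u):
    """Metric of the unit sphere in latitude-longitude coordinates."""
    return [[1.0, 0.0], [0.0, math.cos(u[0]) ** 2]]

def christoffel_at_P():
    """GAMMA[alpha][lam][mu]: Christoffel symbols of the second kind at P."""
    s, c = math.sin(U_P[0]), math.cos(U_P[0])
    g = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    g[0][1][1] = s * c
    g[1][0][1] = g[1][1][0] = -s / c
    return g

GAMMA = christoffel_at_P()

def u_of_v(v):
    """Transformation (59.4)."""
    return tuple(U_P[a] + v[a]
                 - 0.5 * sum(GAMMA[a][l][m] * v[l] * v[m]
                             for l in range(2) for m in range(2))
                 for a in range(2))

def jacobian(v):
    """du^alpha/dv^lam from (59.5)."""
    return [[(1.0 if a == l else 0.0) - sum(GAMMA[a][l][m] * v[m] for m in range(2))
             for l in range(2)] for a in range(2)]

def metric_v(v):
    """Metric coefficients in the new coordinates, by the tensor law."""
    a, J = metric_u(u_of_v(v)), jacobian(v)
    return [[sum(a[al][be] * J[al][l] * J[be][m]
                 for al in range(2) for be in range(2))
             for m in range(2)] for l in range(2)]

def max_first_derivative(metric, pt, h=1e-5):
    """Largest central-difference first derivative of any metric coefficient."""
    worst = 0.0
    for k in range(2):
        plus, minus = list(pt), list(pt)
        plus[k] += h
        minus[k] -= h
        gp, gm = metric(plus), metric(minus)
        for l in range(2):
            for m in range(2):
                worst = max(worst, abs(gp[l][m] - gm[l][m]) / (2 * h))
    return worst

old = max_first_derivative(metric_u, U_P)          # not zero in the u-coordinates
new = max_first_derivative(metric_v, (0.0, 0.0))   # close to zero at P
print(old, new)
```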

We conclude this section by a remark that there is an extension of this 
result by Fermi, who proved that in every Riemannian manifold R, there 
exists a coordinate system such that the coordinates are geodesic at all 


points of an arbitrarily prescribed analytic curve.$^{16}$


60. Parallel Vector Fields in a Surface 


The concept of parallel vector fields along a curve imbedded in $E_3$
(Sec. 48) was generalized by Levi-Civita to curves imbedded in n-dimensional
Riemannian manifolds. As an illustration of the usefulness of the
concept, consider a surface S immersed in $E_3$ and a curve C on S. We
take equations of C in the form

    $C: \quad u^\alpha = u^\alpha(t), \quad t_1 \le t \le t_2,$

and suppose that the metric properties of S are governed by the tensor
$a_{\alpha\beta}$. If $A^\alpha$ is a surface vector field defined along C, we can calculate the
surface intrinsic derivative

(60.1)    $\dfrac{\delta A^\alpha}{\delta t} = \dfrac{dA^\alpha}{dt} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix} A^\beta \dfrac{du^\gamma}{dt}.$

16 A derivation of explicit equations of transformation for this case, which include
(59.4) as a special case, was given by Levi-Civita in a paper entitled “Sur l’écart
géodésique,” Mathematische Annalen, 97 (1926-27), pp. 291-320.


This is identical in form with the left-hand member of equation 48.1
defining the parallel vector field along a space curve. Accordingly, we
take the differential equation

(60.2)    $\dfrac{dA^\alpha}{dt} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix} A^\beta \dfrac{du^\gamma}{dt} = 0,$

which determines a unique vector field when the components of the vector
are specified at an arbitrary point of C, as the definition of the parallel
vector field along a curve C on the surface S. If the parameter t is chosen
as the arc length s, equation 60.2 reads

(60.3)    $\dfrac{dA^\alpha}{ds} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix} A^\beta \dfrac{du^\gamma}{ds} = 0,$

and if $A^\alpha$ is taken to be the unit tangent vector to C so that

    $A^\alpha = \dfrac{du^\alpha}{ds}$

with $a_{\alpha\beta} A^\alpha A^\beta = 1$, then (60.3) yields

(60.4)    $\dfrac{d^2 u^\alpha}{ds^2} + \begin{Bmatrix} \alpha \\ \beta\gamma \end{Bmatrix} \dfrac{du^\beta}{ds} \dfrac{du^\gamma}{ds} = 0.$

This equation is recognized as the equation of a geodesic on S, and hence 
one can enunciate a 

THEOREM. The vector obtained by the parallel propagation of the tangent 
vector to a geodesic always remains tangent to the geodesic. 

From uniqueness of solution of (60.4) it follows that the property of 
tangency of a parallel vector field to a surface curve is both a necessary 
and sufficient condition for a geodesic. 

In the Euclidean plane geodesics are straight lines, and the parallel 
vector field formed by the tangents to a straight line traces out the same 
straight line. On the surface of the sphere the geodesic is an arc of a 
great circle joining two given points on the sphere, and the corresponding 
vector field is the field of tangents to the geodesic. From the last example 
it is clear that parallelism with respect to a surface curve differs from the 
parallelism with respect to a space curve imbedded in $E_3$, since vectors
obtained by a parallel propagation, along the surface curve C, need not 
be parallel in the Euclidean sense. However, it is easy to prove that the 
lengths of vectors forming a parallel field with respect to C remain 
constant. Indeed, word-for-word repetition of the proof given in Sec. 48 
leads to the conclusion that the angle between two vectors propagated 
in parallel fashion remains unchanged, and it follows, as it did in Sec. 48, 
that the vectors forming a parallel field are constant in magnitude. A 




corollary of this result is that the vector field obtained by a parallel 
propagation of a surface vector along a geodesic makes equal angles 
with the geodesic. 

It should be noted that the concept of parallelism in Riemannian 
manifolds is defined relative to a given curve. A surface vector $A^\alpha$, 
specified at a point P of S, when propagated in parallel manner along a 
given curve C to a point Q, need not coincide with the vector obtained 
by the parallel propagation along a different path joining P and Q. Moreover, 
if a closed curve C, enclosing a simply connected region of S, is 
drawn, and a parallel vector field is constructed starting with some point P 
on C, then the vector obtained by traversing the closed path need not 
coincide with the initial vector. The angle between the initial and final 
vector measures another intrinsic property of S, known as the Gaussian 
curvature of S. This property is introduced in a somewhat different way¹⁷ 
in Sec. 62. 
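The failure of a vector to return to itself can be followed explicitly on a sphere. The sketch below is an illustration only and rests on assumptions not made at this point of the text: a sphere of radius a with latitude $u^1$ and longitude $u^2$, the metric $a_{11} = a^2$, $a_{12} = 0$, $a_{22} = a^2 \cos^2 u^1$, and the curve $u^1 = u_0^1$, $u^2 = t$. The symbols X and Y are auxiliary names introduced here for the physical components of the propagated vector.

```latex
% Parallel propagation along the circle u^1 = u_0^1, u^2 = t on a sphere of radius a.
% Assumed metric: ds^2 = a^2 (du^1)^2 + a^2 \cos^2 u^1 (du^2)^2.
% Nonvanishing Christoffel symbols of this metric:
%   {1,22} = \sin u^1 \cos u^1,   {2,12} = {2,21} = -\tan u^1.
% With du^1/dt = 0 and du^2/dt = 1 the equations of parallel propagation become
\begin{align*}
\frac{dA^1}{dt} + \sin u_0^1 \cos u_0^1 \, A^2 &= 0, &
\frac{dA^2}{dt} - \tan u_0^1 \, A^1 &= 0 .
\end{align*}
% Physical components (auxiliary symbols): X = a A^1, Y = a \cos u_0^1 \, A^2.
\begin{align*}
\frac{dX}{dt} &= -\sin u_0^1 \, Y, &
\frac{dY}{dt} &= \sin u_0^1 \, X .
\end{align*}
% Hence X^2 + Y^2 = a_{\alpha\beta} A^\alpha A^\beta is constant, and (X, Y) turns
% through the angle t \sin u_0^1 relative to the coordinate directions.
% After one circuit (t from 0 to 2\pi):
\[
\Delta\theta = 2\pi \sin u_0^1 .
\]
```

On the equator ($u_0^1 = 0$) the vector returns to itself, as it must along a geodesic. For other latitudes the deficit $2\pi - \Delta\theta = 2\pi(1 - \sin u_0^1)$ equals the area $2\pi a^2 (1 - \sin u_0^1)$ of the polar cap bounded by the circle, divided by $a^2$, which is consistent with the remark that the angle between the initial and final vectors reflects the curvature of S.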


61. Isometric Surfaces 


The properties of surfaces with which we have been concerned so far 
hinged entirely on the study of the first fundamental quadratic form 


(61.1)    $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta.$


These properties constitute a body of what is known as the intrinsic 
geometry of surfaces. They take no account of the distinguishing charac- 
teristics of surfaces as they might appear to an observer located in the 
surrounding space. Two surfaces, a cylinder and a cone, for example, 
appear to be entirely different when viewed from the enveloping space, 
and yet their intrinsic geometries are completely indistinguishable since 
metric properties of cylinders and cones can be described by the identical 
expressions for the square of the element of arc. If a coordinate system 
exists on each of the two surfaces such that the linear elements on them 
are characterized by the same metric coefficients $a_{\alpha\beta}$, the surfaces are 
called isometric. Obviously the surfaces of the cylinder and cone are 
isometric with the Euclidean plane, since these surfaces can be rolled out, 
or developed, on the plane without changing the lengths of arc elements, 
and hence without altering the measurements of angles and areas. 
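The statement about the cone can be checked directly. The parametrization below is an assumption made for this illustration (a right circular cone with half-angle $\omega$ and $u^1 > 0$ measured from the vertex); the symbols $\omega$, $\rho$, and $\varphi$ do not appear in the text.

```latex
% Right circular cone of half-angle \omega; u^1 > 0 is the distance from the vertex.
\begin{align*}
y^1 &= u^1 \sin\omega \cos u^2, \qquad
y^2 = u^1 \sin\omega \sin u^2, \qquad
y^3 = u^1 \cos\omega, \\
ds^2 &= dy^i\, dy^i = (du^1)^2 + (u^1)^2 \sin^2\omega \, (du^2)^2 .
\end{align*}
% New surface coordinates \rho = u^1, \varphi = u^2 \sin\omega give
\[
ds^2 = d\rho^2 + \rho^2 \, d\varphi^2 ,
\]
% which is the line element of the Euclidean plane in polar coordinates.
```

Since the same metric coefficients describe the plane, the cone is isometric with the plane in the sense just defined. The range of $\varphi$ is $0 \le \varphi < 2\pi \sin\omega$, so the developed cone covers only a sector of the plane.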

In the following section we introduce an important scalar invariant, 
known as the Gaussian curvature, which will enable us to determine the 
circumstances under which a given surface is developable, that is, isometric 
with the Euclidean plane. 


17 Compare L. P. Eisenhart, Introduction to Differential Geometry, Princeton 
University Press, p. 200. 




As an illustration of an isometric nondevelopable surface consider the 
catenoid 

    $S_1$:  $y^1 = v^1 \cos v^2, \quad y^2 = v^1 \sin v^2, \quad y^3 = a \cosh^{-1} \dfrac{v^1}{a},$

obtained by revolving the catenary $y^1 = a \cosh (y^3/a)$ about the $Y^3$-axis. 
We will show that the surface $S_1$ is isometric with the surface of the 
helicoid defined by 

    $S_2$:  $y^1 = u^1 \cos u^2, \quad y^2 = u^1 \sin u^2, \quad y^3 = a u^2.$

The first fundamental form $ds^2 = dy^i\, dy^i$ for $S_1$ is easily found to be 

(61.2)    $ds^2 = a_{\alpha\beta}\, dv^\alpha dv^\beta = \dfrac{(v^1)^2}{(v^1)^2 - a^2} (dv^1)^2 + (v^1)^2 (dv^2)^2,$

so that 

    $a_{11} = \dfrac{(v^1)^2}{(v^1)^2 - a^2}, \quad a_{12} = 0, \quad a_{22} = (v^1)^2.$

For the surface $S_2$, we find 

(61.3)    $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta = (du^1)^2 + [a^2 + (u^1)^2] (du^2)^2,$

so that 

    $a_{11} = 1, \quad a_{12} = 0, \quad a_{22} = a^2 + (u^1)^2.$

Now, if we set in (61.2) 

    $(v^1)^2 - a^2 = (u^1)^2, \quad v^2 = u^2,$

we obtain 

    $ds^2 = (du^1)^2 + [(u^1)^2 + a^2] (du^2)^2.$

Since this is identical with (61.3), the surfaces $S_1$ and $S_2$ are isometric. 
It follows from discussion in the next section that these surfaces are not 
developable. 
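The substitution used to pass from the catenoid metric to the helicoid metric can be verified in three lines. The sketch uses only the relation $(v^1)^2 - a^2 = (u^1)^2$, $v^2 = u^2$ and the coefficients of the catenoid form $\dfrac{(v^1)^2}{(v^1)^2 - a^2} (dv^1)^2 + (v^1)^2 (dv^2)^2$.

```latex
% Check of the substitution (v^1)^2 - a^2 = (u^1)^2, v^2 = u^2.
\begin{align*}
v^1 \, dv^1 &= u^1 \, du^1
  \quad\Longrightarrow\quad
  (dv^1)^2 = \frac{(u^1)^2}{(v^1)^2} \, (du^1)^2, \\
\frac{(v^1)^2}{(v^1)^2 - a^2} \, (dv^1)^2
  &= \frac{(v^1)^2}{(u^1)^2} \cdot \frac{(u^1)^2}{(v^1)^2} \, (du^1)^2
   = (du^1)^2, \\
(v^1)^2 \, (dv^2)^2 &= \bigl[ (u^1)^2 + a^2 \bigr] \, (du^2)^2 .
\end{align*}
```

Adding the last two lines reproduces $ds^2 = (du^1)^2 + [(u^1)^2 + a^2](du^2)^2$.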


62. The Riemann-Christoffel Tensor and the Gaussian Curvature 


The formulas of Sec. 37 describing the properties of Riemann-Christoffel 
tensors in n-dimensional manifolds simplify considerably when n is set 
equal to 2. Thus, if we are given the first fundamental form, 


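(62.1)    $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta,$

of the surface S, we can form the Christoffel symbols with respect to this 
surface, and the corresponding Riemann tensor 

(62.2)    $R_{\alpha\beta\gamma\delta} = \begin{vmatrix} \dfrac{\partial}{\partial u^\gamma} & \dfrac{\partial}{\partial u^\delta} \\ [\beta\gamma, \alpha] & [\beta\delta, \alpha] \end{vmatrix} + \begin{vmatrix} \left\{ {\lambda \atop \beta\gamma} \right\} & \left\{ {\lambda \atop \beta\delta} \right\} \\ [\alpha\gamma, \lambda] & [\alpha\delta, \lambda] \end{vmatrix}.$

We recall that this tensor is skew-symmetric in the first two and last 
two indices, so that, for the surface S, 

(62.3)    $R_{\alpha\alpha\beta\gamma} = R_{\alpha\beta\gamma\gamma} = 0, \quad R_{1212} = R_{2121} = -R_{2112} = -R_{1221}.$

Hence every nonvanishing component of the Riemann tensor is equal to 
$R_{1212}$ or to its negative. 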
We define the quantity K by the formula 


(62.4)    $K = \dfrac{R_{1212}}{a},$

where $a = |a_{\alpha\beta}|$, and call it the Gaussian curvature or the total curvature 
of the surface S. Since only metric coefficients $a_{\alpha\beta}$ are involved in this 
definition, the properties described by K are intrinsic properties of the 
surface S. 
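If we introduce the two-dimensional $\epsilon$-tensors, 

    $\epsilon_{\alpha\beta} = \sqrt{a}\, e_{\alpha\beta}$  and  $\epsilon^{\alpha\beta} = \dfrac{1}{\sqrt{a}}\, e^{\alpha\beta},$

where the $e_{\alpha\beta}$ and $e^{\alpha\beta}$ are the alternating e-systems (see Sec. 40), and note 
relations 62.3, we can write equation 62.4 as 

(62.5)    $R_{\alpha\beta\gamma\delta} = K \epsilon_{\alpha\beta} \epsilon_{\gamma\delta}.$

Since $\epsilon^{\alpha\beta} \epsilon_{\alpha\beta} = 2$, we can solve (62.5) for K and obtain 

(62.6)    $K = \tfrac{1}{4} R_{\alpha\beta\gamma\delta}\, \epsilon^{\alpha\beta} \epsilon^{\gamma\delta}.$

These equations show that the Gaussian curvature is an invariant. 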

Now, when a surface S is isometric with the Euclidean plane, there 
exists on S a coordinate system with respect to which $a_{11} = a_{22} = 1$, 
$a_{12} = 0$. It is obvious that in this case $R_{\alpha\beta\gamma\delta} = 0$ in this particular coordinate 
system, and since $R_{\alpha\beta\gamma\delta}$ is a tensor, it must vanish in every coordinate 
system. 

Conversely, if the Riemann tensor vanishes at all points of the surface, 
Theorem II of Sec. 39 guarantees that there exist coordinate systems 
on the surface such that $a_{11} = a_{22} = 1$, $a_{12} = 0$. 




Thus we have a 

THEOREM. A necessary and sufficient condition that a surface S be 
isometric with the Euclidean plane is that the Riemann tensor (or the 
Gaussian curvature of S) be identically zero. 

Consider next an invariant 


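(62.7)    $R = a^{\alpha\beta} R_{\alpha\beta},$

where 

(62.8)    $R_{\alpha\beta} = R^{\gamma}_{\ \alpha\beta\gamma} = a^{\gamma\delta} R_{\delta\alpha\beta\gamma}$

is the Ricci tensor introduced in Sec. 38.¹⁸ 
If we multiply (62.8) by $a^{\alpha\beta}$ and sum, we get 

(62.9)    $R = a^{\alpha\beta} a^{\gamma\delta} R_{\delta\alpha\beta\gamma},$

and recalling (62.3) we see that (62.9) is equivalent to 

(62.10)   $R = -2 R_{1212} (a^{11} a^{22} - a^{12} a^{12}).$

Since 

    $a^{11} = \dfrac{a_{22}}{a}, \quad a^{12} = -\dfrac{a_{12}}{a}, \quad a^{22} = \dfrac{a_{11}}{a},$

we have 

(62.11)   $R = -\dfrac{2 R_{1212}}{a}.$

Comparing (62.11) with (62.4) we see that 

    $R = -2K.$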


The invariant R is sometimes called the Einstein curvature of S. 

We shall give a more revealing geometrical interpretation of the Gaussian 
curvature in Sec. 72, where the surface S is viewed from the enveloping 
space. 
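As a worked instance of the definition of K, consider the metric $ds^2 = (du^1)^2 + [a^2 + (u^1)^2](du^2)^2$ shared by the helicoid and the catenoid of Sec. 61. The sketch below assumes, without deriving it here, the standard expression of K in orthogonal coordinates, $K = -\dfrac{1}{2\sqrt{\Delta}} \left[ \dfrac{\partial}{\partial u^1} \left( \dfrac{1}{\sqrt{\Delta}} \dfrac{\partial a_{22}}{\partial u^1} \right) + \dfrac{\partial}{\partial u^2} \left( \dfrac{1}{\sqrt{\Delta}} \dfrac{\partial a_{11}}{\partial u^2} \right) \right]$. The symbol $\Delta$ is introduced only for this example to denote the determinant $|a_{\alpha\beta}|$, so that it is not confused with the constant a of the surface.

```latex
% Gaussian curvature of ds^2 = (du^1)^2 + [a^2 + (u^1)^2] (du^2)^2.
% \Delta denotes the determinant |a_{\alpha\beta}| (auxiliary symbol);
% a is the constant appearing in the equations of the helicoid.
\begin{align*}
\Delta &= a_{11} a_{22} - a_{12}^2 = a^2 + (u^1)^2, \qquad
\frac{\partial a_{22}}{\partial u^1} = 2u^1, \qquad
\frac{\partial a_{11}}{\partial u^2} = 0, \\
K &= -\frac{1}{2\sqrt{\Delta}} \,
     \frac{\partial}{\partial u^1}
     \left( \frac{2u^1}{\sqrt{a^2 + (u^1)^2}} \right)
   = -\frac{1}{2\sqrt{\Delta}} \cdot
     \frac{2a^2}{\bigl[ a^2 + (u^1)^2 \bigr]^{3/2}}
   = -\frac{a^2}{\bigl[ a^2 + (u^1)^2 \bigr]^2} .
\end{align*}
```

For $a \neq 0$ the curvature is nowhere zero, so by the theorem of this section neither surface is isometric with the Euclidean plane.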


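18 We recall that $R^{\alpha}_{\ \beta\gamma\delta} = a^{\alpha\lambda} R_{\lambda\beta\gamma\delta}$. 


Problems 


1. Use formulas 62.2 and 62.4 to show that, if the system of coordinates is 
orthogonal, then 

    $K = -\dfrac{1}{2\sqrt{a}} \left[ \dfrac{\partial}{\partial u^1} \left( \dfrac{1}{\sqrt{a}} \dfrac{\partial a_{22}}{\partial u^1} \right) + \dfrac{\partial}{\partial u^2} \left( \dfrac{1}{\sqrt{a}} \dfrac{\partial a_{11}}{\partial u^2} \right) \right].$

2. Calculate the total curvature of the manifold whose quadratic form is 

    $ds^2 = a^2 \sin^2 u^1 (du^2)^2 + a^2 (du^1)^2.$

3. Determine whether the surface of a helicoid given by 

    $y^1 = u^1 \cos u^2, \quad y^2 = u^1 \sin u^2, \quad y^3 = a u^2,$

is developable. 

4. Show that, for a surface of revolution defined by 

    $y^1 = u^1 \cos u^2, \quad y^2 = u^1 \sin u^2, \quad y^3 = f(u^1),$

    $K = \dfrac{f' f''}{u^1 (1 + f'^2)^2},$

when f is of class $C^2$. 

5. Show that the surface defined by 

    $y^1 = f_1(u^1), \quad y^2 = f_2(u^1), \quad y^3 = u^2,$

where $f_1$ and $f_2$ are differentiable functions, is developable. 

6. Show that the formula for the Gaussian curvature K can be written in the 
form 

    $K = \dfrac{1}{2\sqrt{a}} \left\{ \dfrac{\partial}{\partial u^1} \left[ \dfrac{a_{12}}{a_{11} \sqrt{a}} \dfrac{\partial a_{11}}{\partial u^2} - \dfrac{1}{\sqrt{a}} \dfrac{\partial a_{22}}{\partial u^1} \right] + \dfrac{\partial}{\partial u^2} \left[ \dfrac{2}{\sqrt{a}} \dfrac{\partial a_{12}}{\partial u^1} - \dfrac{1}{\sqrt{a}} \dfrac{\partial a_{11}}{\partial u^2} - \dfrac{a_{12}}{a_{11} \sqrt{a}} \dfrac{\partial a_{11}}{\partial u^1} \right] \right\}.$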


63. The Geodesic Curvature of Surface Curves 


We shall conclude our study of intrinsic geometry of surfaces with a 
derivation of a formula describing the behavior of the tangent vector 
to a surface curve. This formula is analogous to Frenet’s formula 50.1. 

Let C be a surface curve defined parametrically by 


(63.1)    $u^\alpha = u^\alpha(s),$

where s is the arc parameter. Accordingly, at every point of the curve we 
have the condition 

(63.2)    $a_{\alpha\beta} \dfrac{du^\alpha}{ds} \dfrac{du^\beta}{ds} = 1.$

The quantities $du^1/ds$, $du^2/ds$ obviously determine a tangent vector $\lambda$ 
to C, and it is clear from (63.2) that 

(63.3)    $\lambda^\alpha = \dfrac{du^\alpha}{ds}$

is a unit vector. If we differentiate the quadratic relation $a_{\alpha\beta} \lambda^\alpha \lambda^\beta = 1$ 
intrinsically with respect to s, we obtain $a_{\alpha\beta} \lambda^\alpha (\delta\lambda^\beta/\delta s) = 0$, from which it 
follows that the surface vector $\delta\lambda^\alpha/\delta s$ is orthogonal to $\lambda^\alpha$. Following the 
line of thought of Sec. 49, we introduce a unit surface vector $\eta^\alpha$ normal 
to $\lambda^\alpha$, so that 

(63.4)    $\dfrac{\delta\lambda^\alpha}{\delta s} = \chi_g \eta^\alpha,$

where $\chi_g$ is a suitable scalar. In order to determine the direction of $\eta^\alpha$ 
uniquely we choose $\eta^\alpha$ in the way analogous to the choice of the triad 
of vectors in Sec. 49 (equation 49.11), namely, $\epsilon_{\alpha\beta} \lambda^\alpha \eta^\beta = 1$. This choice 
of the orientation of $\lambda$ and $\eta$ uniquely determines the sign of $\chi_g$, and it 
amounts to saying that the sine of the angle between $\lambda$ and $\eta$ is +1. The 
vector $\eta^\alpha$ is the unit surface vector orthogonal to the curve C, and the 
scalar $\chi_g$ is called the geodesic curvature of C. 

We recall that the equation of the geodesic on S (see equation 60.4) 
can be written as $\delta\lambda^\alpha/\delta s = 0$. Comparing this with (63.4) leads to the 
conclusion that, if the geodesic curvature $\chi_g = 0$, then the curve C is a 
geodesic, and conversely. Hence the 

THEOREM. A necessary and sufficient condition that a curve on a surface 
S be a geodesic is that its geodesic curvature be zero. 

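As an illustration, we compute the geodesic curvature of the small 
circle 

    C:  $u^1 = \text{constant} = u_0^1, \quad u^2 = u^2,$

on the surface of the sphere (Fig. 26) 

    S:  $y^1 = a \cos u^1 \cos u^2, \quad y^2 = a \cos u^1 \sin u^2, \quad y^3 = a \sin u^1.$

Fig. 26 

If the arc-length s of C is measured from the plane $u^2 = 0$, we have 
$u^2 = s/(a \cos u_0^1)$, and the equations of C can be written in the form 

(63.5)    $u^1 = u_0^1, \quad u^2 = \dfrac{s}{a \cos u_0^1}.$

From (63.5) we find that the components of the unit tangent vector 
$\lambda^\alpha = du^\alpha/ds$ along C are 

(63.6)    $\lambda^1 = 0, \quad \lambda^2 = \dfrac{1}{a \cos u_0^1},$

so that 

    $\dfrac{\delta\lambda^1}{\delta s} = \dfrac{d\lambda^1}{ds} + \left\{ {1 \atop \alpha\beta} \right\} \lambda^\alpha \dfrac{du^\beta}{ds} = \left\{ {1 \atop 22} \right\} \lambda^2 \lambda^2$

and 

    $\dfrac{\delta\lambda^2}{\delta s} = \dfrac{d\lambda^2}{ds} + \left\{ {2 \atop \alpha\beta} \right\} \lambda^\alpha \dfrac{du^\beta}{ds} = \left\{ {2 \atop 22} \right\} \lambda^2 \lambda^2.$

Since metric coefficients of S are $a_{11} = a^2$, $a_{12} = 0$, $a_{22} = a^2 \cos^2 u^1$, we 
find, on referring to Problem 3 of Sec. 31, that $\left\{ {1 \atop 22} \right\} = \sin u^1 \cos u^1$, 
$\left\{ {2 \atop 22} \right\} = 0$. Accordingly, formulas 63.4 yield 

(63.7)    $\dfrac{\delta\lambda^1}{\delta s} = \chi_g \eta^1 = \dfrac{1}{a^2} \tan u_0^1, \quad \dfrac{\delta\lambda^2}{\delta s} = \chi_g \eta^2 = 0.$

Since C is not a geodesic, $\chi_g \neq 0$, and we conclude that $\eta^2 = 0$. But $\eta$ 
is a unit vector so that $a_{\alpha\beta} \eta^\alpha \eta^\beta = 1$, and we find that $\eta^1 = 1/a$. Hence 
equations 63.7 yield $\chi_g = (\tan u_0^1)/a$. In Sec. 71 we establish the relation 
between the geodesic curvature $\chi_g$ and the ordinary curvature $\chi$ of C. 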
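The value found for the geodesic curvature of the small circle can be checked against its ordinary curvature as a curve in $E_3$. The sketch below assumes the classical decomposition $\chi^2 = \chi_g^2 + \chi_n^2$, which is not derived in this section, together with the fact that the normal curvature of a sphere of radius a has magnitude 1/a in every direction. The symbol $\chi_n$ is introduced only for this check.

```latex
% Consistency check for the small circle u^1 = u_0^1 on the sphere of radius a.
% Assumption (not derived in this section): \chi^2 = \chi_g^2 + \chi_n^2,
% with |\chi_n| = 1/a on a sphere.
\begin{align*}
\chi &= \frac{1}{a \cos u_0^1}
   && \text{(circle of radius } a \cos u_0^1 \text{ in } E_3\text{)}, \\
\chi_g^2 + \chi_n^2
   &= \frac{\tan^2 u_0^1}{a^2} + \frac{1}{a^2}
    = \frac{1}{a^2 \cos^2 u_0^1}
    = \chi^2 .
\end{align*}
```

Only the magnitude of $\chi_g$ enters this check; its sign depends on the orientation adopted for $\eta$. On the equator, $u_0^1 = 0$, the geodesic curvature vanishes, in agreement with the theorem that geodesics are the curves of zero geodesic curvature.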


64. Surfaces in Space 


With the exception of occasional references to the surrounding space, 
our study of geometry of surfaces was carried out from the point of 
view of a two-dimensional being whose universe is determined by the 
surface parameters $u^1$ and $u^2$. The treatment of surfaces presented in 
the foregoing was based entirely on the study of the first quadratic 
differential form. In the discussion of isometric surfaces in Sec. 61, we 




remarked that a pair of isometric surfaces, a cone and a cylinder for 
example, which are indistinguishable in intrinsic geometry, appear to be 
quite distinct to an observer examining them from a reference frame 
located in the space in which the surfaces are imbedded. An entity that 
provides a characterization of the shape of the surface as it appears 
from the enveloping space is the normal line to the surface. The behavior 
of the normal line as its foot is displaced along the surface depends on 
the shape of the surface, and it occurred to Gauss to describe certain 
properties of surfaces with the aid of a quadratic form that depends in 
a fundamental way on the behavior of the normal line. Before we 
introduce this new quadratic form, let us recall our point of departure 
in the study of surfaces in Secs. 52 and 53. 

A surface S imbedded in $E_3$ was defined by three parametric equations 

(64.1)    $y^i = y^i(u^1, u^2), \quad (i = 1, 2, 3),$

where the $y^i$ are orthogonal cartesian coordinates of the reference frame 
located in the space surrounding S. An element of arc ds of a curve 
lying on S is determined by the formula 

(64.2)    $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta,$

where 

    $a_{\alpha\beta} = \dfrac{\partial y^i}{\partial u^\alpha} \dfrac{\partial y^i}{\partial u^\beta}.$

The choice of cartesian variables $y^i$ in the space enveloping the surface 
is clearly not essential, and we could have equally well referred the points 
of $E_3$ to a curvilinear coordinate system X related to Y by the transformation 
$x^i = x^i(y^1, y^2, y^3)$. Now, relative to the frame X, the line 
element in $E_3$ is given by 

(64.3)    $ds^2 = g_{ij}\, dx^i dx^j,$

where $g_{ij} = \dfrac{\partial y^k}{\partial x^i} \dfrac{\partial y^k}{\partial x^j}$, and the set of equations 64.1 for the surface S 
can be written as 

(64.4)    S:  $x^i = x^i(u^1, u^2).$

It follows from this representation of S that 

(64.5)    $dx^i = \dfrac{\partial x^i}{\partial u^\alpha}\, du^\alpha,$

and hence the expression for the surface element of arc (64.3) assumes 
the form 

    $ds^2 = g_{ij} \dfrac{\partial x^i}{\partial u^\alpha} \dfrac{\partial x^j}{\partial u^\beta}\, du^\alpha du^\beta.$

A comparison of this with equation 64.2 leads to the conclusion that 

(64.6)    $a_{\alpha\beta} = g_{ij} \dfrac{\partial x^i}{\partial u^\alpha} \dfrac{\partial x^j}{\partial u^\beta}, \quad (i, j = 1, 2, 3), \quad (\alpha, \beta = 1, 2).$
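A short computation shows how the relation $a_{\alpha\beta} = g_{ij}\, (\partial x^i/\partial u^\alpha)(\partial x^j/\partial u^\beta)$ works when the space coordinates are curvilinear. The choices below are assumptions made for this example only: the system X is taken to be spherical polar coordinates, and the surface is the sphere $x^1 = a$ with the two angles as Gaussian coordinates.

```latex
% Illustration with a curvilinear space system (assumed for this example):
%   x^1 = r, x^2 = \theta, x^3 = \varphi,
%   g_{11} = 1, g_{22} = (x^1)^2, g_{33} = (x^1)^2 \sin^2 x^2, g_{ij} = 0 for i \neq j.
% Surface: the sphere x^1 = a, x^2 = u^1, x^3 = u^2.
\begin{align*}
\frac{\partial x^1}{\partial u^\alpha} &= 0, \qquad
\frac{\partial x^2}{\partial u^1} = 1, \quad
\frac{\partial x^2}{\partial u^2} = 0, \qquad
\frac{\partial x^3}{\partial u^1} = 0, \quad
\frac{\partial x^3}{\partial u^2} = 1, \\
a_{11} &= g_{22} = a^2, \qquad
a_{12} = g_{23} = 0, \qquad
a_{22} = g_{33} = a^2 \sin^2 u^1 .
\end{align*}
```

The surface metric is thus read off from the space metric evaluated on S, without any return to cartesian coordinates.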


We note that the foregoing formulas depend on both the Latin and 
Greek indices, and we recall that the Latin indices run from 1 to 3 and 
refer to the surrounding space, whereas the Greek indices assume values 
1 and 2 and are associated with the surface S imbedded in $E_3$. Furthermore, 
the $dx^i$ and $g_{ij}$'s are tensors with respect to the transformations 
induced on the space variables $x^i$, whereas such quantities as $du^\alpha$ and $a_{\alpha\beta}$ are 
tensors with respect to the transformation of Gaussian surface coordinates 
$u^\alpha$. Equation 64.6 is a curious one since it contains partial derivatives, 
$\partial x^i/\partial u^\alpha$, depending on both Latin and Greek indices. Since both $a_{\alpha\beta}$ 
and $g_{ij}$ in (64.6) are tensors, this formula suggests that $\partial x^i/\partial u^\alpha$ can be 
regarded either as a contravariant space vector or as a covariant surface 
vector. Let us investigate this set of quantities more closely. 

Let us take a small displacement on the surface S, specified by the 
surface vector $du^\alpha$. The same displacement, as is clear from (64.5), is 
described by the space vector with components 

[64.5]    $dx^i = \dfrac{\partial x^i}{\partial u^\alpha}\, du^\alpha.$

The left-hand member of this expression is independent of the Greek 
indices, and hence it is invariant relative to a change of the surface 
coordinates $u^\alpha$. Since $du^\alpha$ is an arbitrary surface vector, we conclude that 

(64.7)    $\dfrac{\partial x^i}{\partial u^\alpha}$

is a covariant surface vector. On the other hand, if we change the space 
coordinates, the $du^\alpha$, being a surface vector, is invariant relative to this 
change, so that (64.7) must be a contravariant space vector. Hence we 
can write (64.7) as 

(64.8)    $x^i_\alpha \equiv \dfrac{\partial x^i}{\partial u^\alpha},$

where the indices properly describe the tensor character of this set of 
quantities. 




A simple geometrical significance of the set of quantities 64.8 can be 
deduced from Fig. 27. Let r be the position vector of an arbitrary point 
P on S. The point P is determined by a pair of Gaussian coordinates 
$(u^1, u^2)$, or by a triplet of space coordinates $(x^1, x^2, x^3)$. Accordingly, 
the vector r can be viewed as a function of the space variables $x^i$ satisfying 
equations 64.4. Thus 

(64.9)    $\dfrac{\partial r}{\partial u^\alpha} = \dfrac{\partial r}{\partial x^i} \dfrac{\partial x^i}{\partial u^\alpha}.$

But $\partial r/\partial x^i$ are the base vectors $b_i$ at P, associated with the curvilinear 
system X, whereas $\partial r/\partial u^\alpha$ are the base vectors $a_\alpha$ at P relative to the 
Gaussian system U. 

Hence equations 64.9 yield 

(64.10)   $a_\alpha = b_i \dfrac{\partial x^i}{\partial u^\alpha}.$

It is clear from this representation that $a_\alpha = b_i x^i_\alpha$, so 
that $\partial x^i/\partial u^\alpha \equiv x^i_\alpha$, ($\alpha = 1, 2$), are the contravariant components of 
the surface base vectors $a_\alpha$ referred to the base systems $b_i$. Thus the 
sets of quantities 

    $a_1: \left( \dfrac{\partial x^1}{\partial u^1}, \dfrac{\partial x^2}{\partial u^1}, \dfrac{\partial x^3}{\partial u^1} \right), \quad a_2: \left( \dfrac{\partial x^1}{\partial u^2}, \dfrac{\partial x^2}{\partial u^2}, \dfrac{\partial x^3}{\partial u^2} \right)$

transform in a contravariant manner relative to the transformation of 
space coordinates $x^i$. 

Fig. 27 
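We can also show that the three surface vectors 

    $\left( \dfrac{\partial x^1}{\partial u^1}, \dfrac{\partial x^1}{\partial u^2} \right), \quad \left( \dfrac{\partial x^2}{\partial u^1}, \dfrac{\partial x^2}{\partial u^2} \right), \quad \left( \dfrac{\partial x^3}{\partial u^1}, \dfrac{\partial x^3}{\partial u^2} \right)$

transform according to the covariant law with respect to the transformation 
of Gaussian surface coordinates $u^\alpha$. Indeed, consider a transformation 
$u^\alpha = u^\alpha(\bar{u}^1, \bar{u}^2)$; then the equations 64.4 of S go over into $x^i = x^i(\bar{u}^1, \bar{u}^2)$, 
and 

(64.11)   $\dfrac{\partial x^i}{\partial \bar{u}^\alpha} = \dfrac{\partial x^i}{\partial u^\beta} \dfrac{\partial u^\beta}{\partial \bar{u}^\alpha}.$

But $\partial x^i/\partial \bar{u}^\alpha \equiv \bar{x}^i_\alpha$, and (64.11) yields, for i = 1, 2, 3, $\bar{x}^i_\alpha = (\partial u^\beta/\partial \bar{u}^\alpha)\, x^i_\beta$. 
This is the covariant law. 

Let ds be an element of arc joining a pair of points $P(u^1, u^2)$ and 
$P'(u^1 + du^1, u^2 + du^2)$ on S. The direction of the line element ds is given 
by the direction parameters $du^\alpha/ds = \lambda^\alpha$. The same direction can be 
specified by an observer in the enveloping space by means of three parameters 
$dx^i/ds = \lambda^i$, and it follows from (64.5) that $\lambda^i = x^i_\alpha \lambda^\alpha$. This formula 
tells us that any surface vector $A^\alpha$ (that is, a vector lying in the tangent 
plane to S) can be viewed as a space vector with components $A^i$ determined 
by 

(64.12)   $A^i = x^i_\alpha A^\alpha.$

We shall refer to a vector $A^i$ determined by this formula as a tangent 
vector to the surface S. 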
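That the space observer and the surface observer assign the same length to a tangent vector follows in one line from the rule $A^i = x^i_\alpha A^\alpha$ together with $a_{\alpha\beta} = g_{ij}\, x^i_\alpha x^j_\beta$. The vector $B$ in the second line is an arbitrary second tangent vector introduced for the illustration.

```latex
% Length of a tangent vector, and scalar product of two tangent vectors,
% computed in the space and on the surface.
\begin{align*}
g_{ij} A^i A^j
  &= g_{ij}\, x^i_\alpha x^j_\beta\, A^\alpha A^\beta
   = a_{\alpha\beta} A^\alpha A^\beta , \\
g_{ij} A^i B^j
  &= g_{ij}\, x^i_\alpha x^j_\beta\, A^\alpha B^\beta
   = a_{\alpha\beta} A^\alpha B^\beta .
\end{align*}
```

Lengths and angles of tangent vectors may therefore be computed with either metric.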


65. The Normal Line to the Surface 


Let A and B be a pair of surface vectors drawn at some point P of S 
(Fig. 28). According to formula 64.12, they can be represented in the form 

(65.1)    $A^i = x^i_\alpha A^\alpha, \quad B^i = x^i_\alpha B^\alpha.$

The vector product A × B is the vector normal to the tangent plane 
determined by the vectors A and B, and the unit vector n perpendicular 
to the tangent plane, so oriented that A, B, and n form a right-handed 
system, is 

(65.2)    $n = \dfrac{A \times B}{|A \times B|} = \dfrac{A \times B}{AB\, |\sin\theta|},$

where $\theta$ is the angle between A and B. 

We call the vector n the unit normal vector to the surface S at P. Clearly, 
n is a function of coordinates $(u^1, u^2)$, and, as the point $P(u^1, u^2)$ is 
displaced to a new position $P'(u^1 + du^1, u^2 + du^2)$, the vector n undergoes 
a change 

(65.3)    $dn = \dfrac{\partial n}{\partial u^\alpha}\, du^\alpha,$

whereas the position vector r is changed by the amount $dr = \dfrac{\partial r}{\partial u^\alpha}\, du^\alpha$. 

Fig. 28 

Let us form the scalar product 

(65.4)    $dn \cdot dr = \dfrac{\partial n}{\partial u^\alpha} \cdot \dfrac{\partial r}{\partial u^\beta}\, du^\alpha du^\beta.$

If we define 

    $b_{\alpha\beta} = -\dfrac{1}{2} \left( \dfrac{\partial n}{\partial u^\alpha} \cdot \dfrac{\partial r}{\partial u^\beta} + \dfrac{\partial n}{\partial u^\beta} \cdot \dfrac{\partial r}{\partial u^\alpha} \right),$

so that (65.4) reads 

(65.5)    $dn \cdot dr = -b_{\alpha\beta}\, du^\alpha du^\beta,$

the left-hand member of (65.5), being the scalar product of two vectors, 
is obviously an invariant; moreover, from symmetry with respect to $\alpha$ 
and $\beta$, it is clear that the coefficients of $du^\alpha du^\beta$ in the right-hand member 
of (65.5) define a covariant tensor of rank two. The quadratic form 

(65.6)    $\mathcal{B} \equiv b_{\alpha\beta}\, du^\alpha du^\beta$

will be shown to play an essential part in the study of surfaces when 
they are viewed from the surrounding space, just as the first fundamental 
quadratic form $\mathcal{A} \equiv dr \cdot dr$, or 

    $\mathcal{A} = a_{\alpha\beta}\, du^\alpha du^\beta$

did in the study of intrinsic properties of a surface. The differential 
form 65.6 was introduced by Gauss, and it is called the second fundamental 
quadratic form of the surface. 

Since the notation for the unit normal used previously, despite its 
pictorial suggestiveness, is more cumbersome than the tensor notation, 
we shall rewrite the defining formula 65.2 in terms of the components 
$x^i_\alpha$ of the base vectors $a_\alpha$. We denote the contravariant components of n 
by $n^i$ and observe that its covariant components $n_i$ are given by¹⁹ 

(65.7)    $n_i = \dfrac{\epsilon_{ijk} A^j B^k}{AB \sin\theta}$

and (see Sec. 54) 

(65.8)    $AB \sin\theta = \epsilon_{\alpha\beta} A^\alpha B^\beta.$

Substituting in (65.7) from (65.1) and (65.8), we get 

    $(n_i \epsilon_{\alpha\beta} - \epsilon_{ijk} x^j_\alpha x^k_\beta) A^\alpha B^\beta = 0,$

and, since this relation is valid for all surface vectors, we conclude that 

(65.9)    $n_i \epsilon_{\alpha\beta} = \epsilon_{ijk} x^j_\alpha x^k_\beta.$

Multiplying (65.9) through by $\epsilon^{\alpha\beta}$, and noting that $\epsilon^{\alpha\beta} \epsilon_{\alpha\beta} = 2$, we get the 
desired result 

(65.10)   $n_i = \tfrac{1}{2} \epsilon^{\alpha\beta} \epsilon_{ijk} x^j_\alpha x^k_\beta.$

It is clear from the structure of this formula that $n_i$ is a space vector 
which does not depend on the choice of surface coordinates. This fact 
is also obvious from purely geometric considerations. 
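The formula $n_i = \tfrac{1}{2} \epsilon^{\alpha\beta} \epsilon_{ijk} x^j_\alpha x^k_\beta$ can be tried on the sphere of Sec. 63. The sketch assumes orthogonal cartesian space coordinates, so that $\epsilon_{ijk} = e_{ijk}$ and covariant and contravariant components coincide, and it assumes $\cos u^1 > 0$ so that the square root of the determinant of the $a_{\alpha\beta}$ is $a^2 \cos u^1$.

```latex
% Unit normal of the sphere
%   y^1 = a \cos u^1 \cos u^2, y^2 = a \cos u^1 \sin u^2, y^3 = a \sin u^1,
% in cartesian space coordinates (\epsilon_{ijk} = e_{ijk}); assume \cos u^1 > 0.
\begin{align*}
x^i_1 &= (-a \sin u^1 \cos u^2,\; -a \sin u^1 \sin u^2,\; a \cos u^1), \\
x^i_2 &= (-a \cos u^1 \sin u^2,\; a \cos u^1 \cos u^2,\; 0), \\
\sqrt{a_{11} a_{22} - a_{12}^2} &= a^2 \cos u^1, \qquad
\epsilon^{12} = -\epsilon^{21} = \frac{1}{a^2 \cos u^1}, \\
n_i &= \tfrac{1}{2}\, \epsilon^{\alpha\beta} e_{ijk}\, x^j_\alpha x^k_\beta
     = \epsilon^{12}\, e_{ijk}\, x^j_1 x^k_2 \\
    &= \frac{1}{a^2 \cos u^1}\,
       \bigl( -a^2 \cos^2 u^1 \cos u^2,\; -a^2 \cos^2 u^1 \sin u^2,\;
              -a^2 \sin u^1 \cos u^1 \bigr) \\
    &= -(\cos u^1 \cos u^2,\; \cos u^1 \sin u^2,\; \sin u^1) .
\end{align*}
```

The result is a unit vector directed toward the center of the sphere, n = -r/a. This is the orientation for which $a_1$, $a_2$, and n form a right-handed system with the present ordering of $u^1$ and $u^2$; interchanging the two surface coordinates reverses it.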


66. Tensor Derivatives 


In Sec. 67 we shall deduce the second fundamental quadratic form 
65.6 analytically by the operation of tensor differentiation of tensor fields 
which are functions of both surface and space coordinates. The fruitful 
concept of tensor differentiation was introduced by A. J. McConnell, 
whose elegant treatment of surfaces is followed closely in this and several 
other sections of this chapter.²⁰ 

19 Note that the vector product A × B depends on the lengths of the vectors A and B 
and on the angle between them. If we choose an orthogonal cartesian system of axes 
Y, so that the vectors A and B lie in the $Y^1 Y^2$-plane with A directed along the $Y^1$-axis, 
then the cartesian components $A^i$ of A are $A^1 = A$, $A^2 = 0$, $A^3 = 0$, and the components 
of B are $B^1 = B \cos\theta$, $B^2 = B \sin\theta$, $B^3 = 0$. Since in the Y-system $\epsilon_{ijk} = e_{ijk}$, 

    $C_i = \epsilon_{ijk} A^j B^k = e_{i12}\, AB \sin\theta.$

Hence $C_1 = 0$, $C_2 = 0$, $C_3 = AB \sin\theta$. Thus the $C_i$ define the vector C = A × B 
normal to the plane determined by A and B whose magnitude is $AB\, |\sin\theta|$. If $A^\alpha$ and 
$B^\beta$ are the surface components of A and B, then $AB \sin\theta = \epsilon_{\alpha\beta} A^\alpha B^\beta$. This result follows 
immediately from the formula for the sine of the angle between two vectors given in 
Sec. 54. 

Let us consider a curve C lying on a given surface S and a vector $A^i$ 
defined along C. If t is a parameter along C, we can compute the intrinsic 
derivative $\delta A^i/\delta t$ of $A^i$, namely, 

(66.1)    $\dfrac{\delta A^i}{\delta t} = \dfrac{dA^i}{dt} + {}_g\!\left\{ {i \atop jk} \right\} A^j \dfrac{dx^k}{dt}.$

In formula 66.1 the Christoffel symbols ${}_g\!\left\{ {i \atop jk} \right\}$ refer to the space coordinates 
$x^i$ and are formed from the metric coefficients $g_{ij}$. This is indicated by 
the prefix g on the symbol. On the other hand, if we consider a surface 
vector $A^\alpha$ defined along the same curve C, we can form the intrinsic 
derivative with respect to the surface variables, namely, 

(66.2)    $\dfrac{\delta A^\alpha}{\delta t} = \dfrac{dA^\alpha}{dt} + {}_a\!\left\{ {\alpha \atop \beta\gamma} \right\} A^\beta \dfrac{du^\gamma}{dt}.$

In this expression the Christoffel symbols ${}_a\!\left\{ {\alpha \atop \beta\gamma} \right\}$ are formed from the 
metric coefficients $a_{\alpha\beta}$ associated with the Gaussian surface coordinates $u^\alpha$. 
A geometric interpretation of these formulas is at hand when the fields 
$A^i$ and $A^\alpha$ are such that $\delta A^i/\delta t = 0$ and $\delta A^\alpha/\delta t = 0$. In the first equation 
the vectors $A^i$ form a parallel field with respect to C, considered as a 
space curve, whereas the equation $\delta A^\alpha/\delta t = 0$ defines a parallel field with 
respect to C regarded as a surface curve. The corresponding formulas 
for the intrinsic derivatives of the covariant vectors $A_i$ and $A_\alpha$ are 

(66.3)    $\dfrac{\delta A_i}{\delta t} = \dfrac{dA_i}{dt} - {}_g\!\left\{ {k \atop ij} \right\} A_k \dfrac{dx^j}{dt}$

and 

(66.4)    $\dfrac{\delta A_\alpha}{\delta t} = \dfrac{dA_\alpha}{dt} - {}_a\!\left\{ {\gamma \atop \alpha\beta} \right\} A_\gamma \dfrac{du^\beta}{dt}.$


Consider next a tensor field $T^i_\alpha$, which is a contravariant vector with 
respect to a transformation of space coordinates $x^i$ and a covariant vector 
relative to a transformation of surface coordinates $u^\alpha$. An example of a 
field of this type is a tensor $x^i_\alpha = \partial x^i/\partial u^\alpha$ introduced in Sec. 64. If $T^i_\alpha$ is 
defined over a surface curve C, and the parameter along C is t, then $T^i_\alpha$ 
is a function of t. We introduce a parallel vector field $A_i$ along C, regarded 
as a space curve, and a parallel vector field $B^\alpha$ along C, viewed as a surface 
curve, and form an invariant 

    $\Phi(t) = T^i_\alpha A_i B^\alpha.$

20 A. J. McConnell, Absolute Differential Calculus, Chapters XIV-XVI. 


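The derivative of $\Phi(t)$ with respect to the parameter t is given by the 
expression 

(66.5)    $\dfrac{d\Phi}{dt} = \dfrac{dT^i_\alpha}{dt} A_i B^\alpha + T^i_\alpha \dfrac{dA_i}{dt} B^\alpha + T^i_\alpha A_i \dfrac{dB^\alpha}{dt},$

which is obviously an invariant relative to both the space and surface 
coordinates. But, since the fields $A_i(t)$ and $B^\alpha(t)$ are parallel, 

    $\dfrac{dA_i}{dt} = {}_g\!\left\{ {k \atop ij} \right\} A_k \dfrac{dx^j}{dt}, \quad \dfrac{dB^\alpha}{dt} = -{}_a\!\left\{ {\alpha \atop \beta\gamma} \right\} B^\beta \dfrac{du^\gamma}{dt},$

and (66.5) becomes 

(66.6)    $\dfrac{d\Phi}{dt} = \left[ \dfrac{dT^i_\alpha}{dt} + {}_g\!\left\{ {i \atop jk} \right\} T^j_\alpha \dfrac{dx^k}{dt} - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T^i_\delta \dfrac{du^\gamma}{dt} \right] A_i B^\alpha.$

Since this is invariant for an arbitrary choice of parallel fields $A_i$ and $B^\alpha$, 
the quotient law guarantees that the expression in the brackets of (66.6) 
is a tensor of the same character as $T^i_\alpha$. We call this tensor, after 
McConnell, the intrinsic tensor derivative of $T^i_\alpha$ with respect to the parameter 
t, and write 

(66.7)    $\dfrac{\delta T^i_\alpha}{\delta t} = \dfrac{dT^i_\alpha}{dt} + {}_g\!\left\{ {i \atop jk} \right\} T^j_\alpha \dfrac{dx^k}{dt} - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T^i_\delta \dfrac{du^\gamma}{dt}.$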


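If the field $T^i_\alpha$ is defined over the entire surface S, we can argue that, 
since 

    $\dfrac{\delta T^i_\alpha}{\delta t} = \left[ \dfrac{\partial T^i_\alpha}{\partial u^\gamma} + {}_g\!\left\{ {i \atop jk} \right\} T^j_\alpha x^k_\gamma - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T^i_\delta \right] \dfrac{du^\gamma}{dt}$

is a tensor field and $du^\gamma/dt$ is an arbitrary surface vector (for C is arbitrary), 
the expression in the bracket is a tensor of the type $T^i_{\alpha\gamma}$. We write 

(66.8)    $T^i_{\alpha,\gamma} = \dfrac{\partial T^i_\alpha}{\partial u^\gamma} + {}_g\!\left\{ {i \atop jk} \right\} T^j_\alpha x^k_\gamma - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T^i_\delta,$

and call $T^i_{\alpha,\gamma}$ the tensor derivative of $T^i_\alpha$ with respect to $u^\gamma$. 
The extension of this definition to more complicated tensors is obvious 
from the structure of the formula 66.8. Thus the tensor derivative of $T^i_{\alpha\beta}$ 
with respect to $u^\gamma$ is given by 

(66.9)    $T^i_{\alpha\beta,\gamma} = \dfrac{\partial T^i_{\alpha\beta}}{\partial u^\gamma} + {}_g\!\left\{ {i \atop jk} \right\} T^j_{\alpha\beta} x^k_\gamma - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T^i_{\delta\beta} - {}_a\!\left\{ {\delta \atop \beta\gamma} \right\} T^i_{\alpha\delta}.$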




If the surface coordinates at any point P of S are geodesic, and the 
space coordinates are orthogonal cartesian, we see that at that point the 
tensor derivatives reduce to the ordinary derivatives. This leads us to 
conclude that the operations of tensor differentiation of products and 
sums follow the usual rules and that the tensor derivatives of $g_{ij}$, $a_{\alpha\beta}$, 
$\epsilon_{ijk}$, $\epsilon_{\alpha\beta}$ and their associated tensors vanish. Accordingly, they behave 
as constants in the tensor differentiation. 
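Two limiting cases show how the tensor derivative of a mixed field $T^i_\alpha$, namely $\partial T^i_\alpha/\partial u^\gamma$ plus a space Christoffel term contracted with $x^k_\gamma$ minus a surface Christoffel term, contains the familiar covariant derivatives. The sketch is illustrative: in case (ii) it is assumed that the field $T^i$ is defined in a neighborhood of S as a function of the space coordinates, so that the chain rule may be applied.

```latex
% Two limiting cases of the tensor derivative.
% (i) A field with a Greek index only, T_\alpha: the space term is absent,
\[
T_{\alpha,\gamma}
  = \frac{\partial T_\alpha}{\partial u^\gamma}
    - {}_a\!\left\{ {\delta \atop \alpha\gamma} \right\} T_\delta ,
\]
% which is the ordinary covariant derivative on the surface.
% (ii) A field with a Latin index only, T^i, given as a function of the x^k:
\[
T^i_{,\gamma}
  = \frac{\partial T^i}{\partial u^\gamma}
    + {}_g\!\left\{ {i \atop jk} \right\} T^j x^k_\gamma
  = \left( \frac{\partial T^i}{\partial x^k}
    + {}_g\!\left\{ {i \atop jk} \right\} T^j \right) x^k_\gamma ,
\]
% that is, the space covariant derivative T^i_{,k} contracted with x^k_\gamma.
```

Case (ii) is the form in which the tensor derivative of the unit normal $n^i$ is used in the sections that follow.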


67. The Second Fundamental Form of a Surface 


The apparatus developed in the preceding section permits us to obtain 
easily and in the most general form an important set of formulas due 
to Gauss. We will also deduce with its aid the second fundamental 
quadratic form of a surface already encountered in Sec. 65. 

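We begin by calculating the tensor derivative of the tensor $x^i_\alpha$, representing 
the components of the surface base vectors $a_\alpha$. We have 

(67.1)    $x^i_{\alpha,\beta} = \dfrac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + {}_g\!\left\{ {i \atop jk} \right\} x^j_\alpha x^k_\beta - {}_a\!\left\{ {\gamma \atop \alpha\beta} \right\} x^i_\gamma,$

from which we deduce that 

(67.2)    $x^i_{\alpha,\beta} = x^i_{\beta,\alpha}.$

Since the tensor derivative of $a_{\alpha\beta}$ vanishes, we obtain, upon differentiating 
the relation 

[64.6]    $a_{\alpha\beta} = g_{ij} x^i_\alpha x^j_\beta,$

(67.3)    $g_{ij} x^i_{\alpha,\gamma} x^j_\beta + g_{ij} x^i_\alpha x^j_{\beta,\gamma} = 0.$

Interchanging $\alpha$, $\beta$, $\gamma$ cyclically leads to two formulas: 

(67.4)    $g_{ij} x^i_{\beta,\alpha} x^j_\gamma + g_{ij} x^i_\beta x^j_{\gamma,\alpha} = 0,$

(67.5)    $g_{ij} x^i_{\gamma,\beta} x^j_\alpha + g_{ij} x^i_\gamma x^j_{\alpha,\beta} = 0.$

If we add (67.4) and (67.5), subtract (67.3), and take into account the 
symmetry relation (67.2), we obtain 

(67.6)    $g_{ij} x^i_{\alpha,\beta} x^j_\gamma = 0.$

This is the orthogonality relation which states that $x^i_{\alpha,\beta}$ is a space vector 
normal to the surface, and hence it is directed along the unit normal $n^i$. 
Consequently, there exists a set of functions $b_{\alpha\beta}$ such that 

(67.7)    $x^i_{\alpha,\beta} = b_{\alpha\beta} n^i.$

21 Compare A. J. McConnell, Absolute Differential Calculus (1931), p. 200. 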




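The quantities $b_{\alpha\beta}$ are the components of a symmetric surface tensor, 
and the differential quadratic form 

(67.8)    $\mathcal{B} \equiv b_{\alpha\beta}\, du^\alpha du^\beta$

is the desired second fundamental form. 

To demonstrate the equivalence of this definition of the tensor $b_{\alpha\beta}$ 
with that given in Sec. 65, namely, 

    $b_{\alpha\beta} = -\dfrac{1}{2} \left( \dfrac{\partial r}{\partial u^\alpha} \cdot \dfrac{\partial n}{\partial u^\beta} + \dfrac{\partial r}{\partial u^\beta} \cdot \dfrac{\partial n}{\partial u^\alpha} \right),$

note that the vectors n and $a_\alpha = \partial r/\partial u^\alpha$ are orthogonal and hence 

    $n \cdot \dfrac{\partial r}{\partial u^\alpha} = 0$  and  $n \cdot \dfrac{\partial r}{\partial u^\beta} = 0.$

Differentiating these two scalar products with respect to $u^\beta$ and $u^\alpha$, 
respectively, and adding, we get 

    $-\dfrac{1}{2} \left( \dfrac{\partial n}{\partial u^\alpha} \cdot \dfrac{\partial r}{\partial u^\beta} + \dfrac{\partial n}{\partial u^\beta} \cdot \dfrac{\partial r}{\partial u^\alpha} \right) = n \cdot \dfrac{\partial^2 r}{\partial u^\alpha \partial u^\beta}.$

Hence 

(67.9)    $b_{\alpha\beta} = n \cdot \dfrac{\partial^2 r}{\partial u^\alpha \partial u^\beta}.$

However, 

    $\dfrac{\partial r}{\partial u^\alpha} = a_\alpha = b_i x^i_\alpha;$

therefore 

    $\dfrac{\partial^2 r}{\partial u^\alpha \partial u^\beta} = b_i \dfrac{\partial x^i_\alpha}{\partial u^\beta} + \dfrac{\partial b_i}{\partial u^\beta} x^i_\alpha$
    $\qquad = b_i \dfrac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + \dfrac{\partial b_i}{\partial x^j} x^i_\alpha x^j_\beta$
    $\qquad = b_i \left( \dfrac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + {}_g\!\left\{ {i \atop jk} \right\} x^j_\alpha x^k_\beta \right),$

where, in the last step, we made use of formula 46.4 for the derivative of 
the base vector $b_i$. 

If we insert in the right-hand member of the foregoing expression from 
equation 67.1, we get 

(67.10)   $\dfrac{\partial^2 r}{\partial u^\alpha \partial u^\beta} = b_i \left( x^i_{\alpha,\beta} + {}_a\!\left\{ {\gamma \atop \alpha\beta} \right\} x^i_\gamma \right).$

Multiplying equation 67.10 scalarly by n, and observing that the vectors 
$b_i x^i_\gamma = a_\gamma$ and n are orthogonal, we get 

    $n \cdot \dfrac{\partial^2 r}{\partial u^\alpha \partial u^\beta} = n \cdot b_i\, x^i_{\alpha,\beta}$
    $\qquad = n \cdot b_i n^i\, b_{\alpha\beta}$
    $\qquad = b_{\alpha\beta}$

by formula 67.7. This establishes the equivalence of the two definitions 
of the second fundamental quadratic form. 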

Equations 67.7 are known as the formulas of Gauss. The importance 
of the form 67.8 in differential geometry stems from the fact that the 
tensors $a_{\alpha\beta}$ and $b_{\alpha\beta}$, satisfying equations of Gauss and Codazzi (to be 
derived in Sec. 69), determine the surface to within a rigid body motion 
in space. 
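The coefficients of the second fundamental form are easily computed for the sphere of Sec. 63 from the relation $b_{\alpha\beta} = n \cdot \partial^2 r/\partial u^\alpha \partial u^\beta$ established above. The sketch assumes cartesian space coordinates and takes the unit normal to be n = -r/a, the orientation for which $\partial r/\partial u^1$, $\partial r/\partial u^2$, and n form a right-handed system.

```latex
% Second fundamental form of the sphere
%   r = a (\cos u^1 \cos u^2, \cos u^1 \sin u^2, \sin u^1),  n = -r/a (assumed orientation).
\begin{align*}
\frac{\partial^2 r}{\partial u^1 \partial u^1} &= -r, &
b_{11} &= \frac{r \cdot r}{a} = a, \\
\frac{\partial^2 r}{\partial u^1 \partial u^2}
   &= a (\sin u^1 \sin u^2,\; -\sin u^1 \cos u^2,\; 0), &
b_{12} &= 0, \\
\frac{\partial^2 r}{\partial u^2 \partial u^2}
   &= -a (\cos u^1 \cos u^2,\; \cos u^1 \sin u^2,\; 0), &
b_{22} &= a \cos^2 u^1 .
\end{align*}
```

Since $a_{11} = a^2$, $a_{12} = 0$, $a_{22} = a^2 \cos^2 u^1$, the two fundamental tensors of the sphere are proportional, $b_{\alpha\beta} = a_{\alpha\beta}/a$. With the opposite choice of normal all the $b_{\alpha\beta}$ change sign.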

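Problems 

1. Show that $b_{\alpha\beta} = x^i_{\alpha,\beta}\, n_i = \tfrac{1}{2} \epsilon^{\gamma\delta} \epsilon_{ijk}\, x^i_{\alpha,\beta} x^j_\gamma x^k_\delta$. 

2. Show that, in the notation of Sec. 65, 

    $b_{\alpha\beta} = -\dfrac{1}{2} \left( \dfrac{\partial n}{\partial u^\alpha} \cdot \dfrac{\partial r}{\partial u^\beta} + \dfrac{\partial n}{\partial u^\beta} \cdot \dfrac{\partial r}{\partial u^\alpha} \right) = -\dfrac{\partial n}{\partial u^\alpha} \cdot \dfrac{\partial r}{\partial u^\beta},$

where n is the unit normal and r is the position vector of the point on the surface. 

3. When the equation of a surface S, referred to a set of orthogonal cartesian 
axes, is taken in the form $y^3 = f(y^1, y^2)$, we can write the parametric equations 
of S in the form $y^1 = u^1$, $y^2 = u^2$, $y^3 = f(u^1, u^2)$. If partial derivatives of 
$f(y^1, y^2)$ are denoted by $f_{y^1} = p$, $f_{y^2} = q$, $f_{y^1 y^1} = r$, $f_{y^1 y^2} = s$, $f_{y^2 y^2} = t$, then the 
coefficients $a_{\alpha\beta}$ in $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta$ are 

    $a_{11} = 1 + p^2, \quad a_{12} = pq, \quad a_{22} = 1 + q^2,$

whereas the coefficients $b_{\alpha\beta}$ of the second fundamental form are 

    $b_{11} = \dfrac{r}{\sqrt{1 + p^2 + q^2}}, \quad b_{12} = \dfrac{s}{\sqrt{1 + p^2 + q^2}}, \quad b_{22} = \dfrac{t}{\sqrt{1 + p^2 + q^2}}.$

Show this and compute $a_{\alpha\beta}$ and $b_{\alpha\beta}$ for the surface of the sphere 
$y^3 = \sqrt{a^2 - (y^1)^2 - (y^2)^2}$. 

4. If equations of S are 

    S:  $y^i = y^i(u^1, u^2),$

where the $y^i$ are orthogonal cartesian coordinates, and r is the position vector 
of the point $(y^1, y^2, y^3)$ on S, then on using subscripts to denote partial derivatives, 

    $a_{\alpha\beta} = r_{u^\alpha} \cdot r_{u^\beta}, \quad b_{\alpha\beta} = [r_{u^\alpha u^\beta} \cdot r_{u^1} \times r_{u^2}] / \sqrt{a}.$

Show this. Use these formulas to compute the $a_{\alpha\beta}$ and $b_{\alpha\beta}$ for the surface of 
revolution $y^1 = u^1 \cos u^2$, $y^2 = u^1 \sin u^2$, $y^3 = f(u^1)$. 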




68. The Integrability Conditions 


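In order to get insight into the significance of the tensor $b_{\alpha\beta}$ let us 
examine more closely the Gauss formulas 

(68.1)    $x^i_{\alpha,\beta} = b_{\alpha\beta} n^i,$

where 

[67.1]    $x^i_{\alpha,\beta} = \dfrac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + {}_g\!\left\{ {i \atop jk} \right\} x^j_\alpha x^k_\beta - {}_a\!\left\{ {\gamma \atop \alpha\beta} \right\} x^i_\gamma$

and 

[65.10]   $n_i = \tfrac{1}{2} \epsilon^{\alpha\beta} \epsilon_{ijk} x^j_\alpha x^k_\beta,$

with $x^i_\alpha = \partial x^i/\partial u^\alpha$. 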

If we insert these expressions in equation 68.1, we obtain a set of 
second-order partial differential equations, in which the dependent variables 
$x^i$ are functions of the surface coordinates $u^\alpha$. The coefficients in 
these differential equations are functions of metric coefficients $g_{ij}$ of the 
manifold in which the surface S, defined by 

(68.2)    $x^i = x^i(u^1, u^2), \quad (i = 1, 2, 3),$

is immersed; they are also functions of $a_{\alpha\beta} = g_{ij} (\partial x^i/\partial u^\alpha)(\partial x^j/\partial u^\beta)$, and $b_{\alpha\beta}$. 
If equations 68.2 are given, we can compute $a_{\alpha\beta}$ and $b_{\alpha\beta}$ (see Problem 1, 
Sec. 67), insert the appropriate expressions in (68.1), and, of course, 
equations 68.1 will be satisfied identically. On the other hand, if the 
functions $a_{\alpha\beta}$ and $b_{\alpha\beta}$ are prescribed in advance, equations 68.1 will become 
equations of conditions, and in general they will have no solutions yielding 
equations 68.2 of the surface S. In order that the tensors $a_{\alpha\beta}$ and $b_{\alpha\beta}$ be 
related to some surface, it is necessary that the $x^i$ satisfy the integrability 
conditions, 

(68.3)    $\dfrac{\partial^2 x^i_\alpha}{\partial u^\beta \partial u^\gamma} = \dfrac{\partial^2 x^i_\alpha}{\partial u^\gamma \partial u^\beta},$

whenever the functions $x^i_\alpha$ are of class $C^2$. From our discussion of 
inversion of order of covariant differentiation in Sec. 36, it follows that 
the condition 68.3 is equivalent to²² (cf. equation 36.6) 

(68.4)    $x^i_{\alpha,\beta\gamma} - x^i_{\alpha,\gamma\beta} = R^{\delta}_{\ \alpha\beta\gamma}\, x^i_\delta,$

where $R^{\delta}_{\ \alpha\beta\gamma}$ is the Riemann tensor of the second kind, formed with the 
aid of the coefficients $a_{\alpha\beta}$ of the first fundamental quadratic form. Equations 
68.4 involve third partial derivatives of the coordinates $x^i$, and we 
shall assume from now on that the functions entering in (68.2) are of 
class $C^3$. 

22 We dispense with the details of computation since they are not essential to the 
course of argument. See, for example, A. J. McConnell, Absolute Differential Calculus, 
p. 203. 

We shall see that the conditions of integrability 68.4 impose certain 
restrictions on the possible choices of functions $b_{\alpha\beta}$ and $a_{\alpha\beta}$. These 
restrictive conditions are known as the equations of Gauss and Codazzi. 
They will be derived in the following section. 


69. Formulas of Weingarten and Equations of Gauss and Codazzi²³ 


In order to derive the equations of Gauss and Codazzi we need an
auxiliary result, due to Weingarten, giving the expressions for the derivatives
of the unit normal vector $n^i$ to S. We begin with the relation
$g_{ij}\, n^i n^j = 1$, and form its tensor derivative. We have

$g_{ij}\, n^i_{,\alpha}\, n^j + g_{ij}\, n^i\, n^j_{,\alpha} = 0$,
or
(69.1)   $g_{ij}\, n^i_{,\alpha}\, n^j = 0$.

Equation 69.1 shows that $n^i_{,\alpha}$, considered as a space vector, is orthogonal
to the unit normal $n^i$, and hence it lies in the tangent plane to the surface.
Accordingly, it can be represented as a linear form in the base vectors $x^i_\gamma$,

(69.2)   $n^i_{,\alpha} = c_\alpha^{\ \gamma}\, x^i_\gamma$.

Since $n^i$ is normal to the surface, we have the orthogonality relation
$g_{ij}\, x^i_\alpha\, n^j = 0$, whose tensor derivative is

(69.3)   $g_{ij}\, x^i_{\alpha,\beta}\, n^j + g_{ij}\, x^i_\alpha\, n^j_{,\beta} = 0$.
But, from (68.1),
(69.4)   $x^i_{\alpha,\beta} = b_{\alpha\beta}\, n^i$,

so that the substitution from (69.4) and (69.2) in (69.3) yields
$b_{\alpha\beta} + g_{ij}\, x^i_\alpha\, x^j_\gamma\, c_\beta^{\ \gamma} = 0$,
and, since $a_{\alpha\gamma} = g_{ij}\, x^i_\alpha\, x^j_\gamma$, we have
$b_{\alpha\beta} = -a_{\alpha\gamma}\, c_\beta^{\ \gamma}$.


23 The treatment given here is patterned after A. J. McConnell, Absolute Differential
Calculus, pp. 201-205.

24 We recall that $n^i_{,\alpha} = \dfrac{\partial n^i}{\partial u^\alpha} + \begin{Bmatrix} i \\ jk \end{Bmatrix} n^j x^k_\alpha$.


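Solving this equation for $c_\beta^{\ \gamma}$, we get

$c_\beta^{\ \gamma} = -a^{\alpha\gamma}\, b_{\alpha\beta}$,
so that equation 69.2 reads
(69.5)   $n^i_{,\beta} = -a^{\alpha\gamma}\, b_{\alpha\beta}\, x^i_\gamma$.

These are the Weingarten formulas which we will use in deriving the
Codazzi equations.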


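The equations we desire follow from the integrability conditions 68.4,
namely,

(69.6)   $x^i_{\alpha,\beta\gamma} - x^i_{\alpha,\gamma\beta} = R^{\delta}_{\cdot\alpha\beta\gamma}\, x^i_\delta$.

We form the tensor derivative of equation 69.4 and use 69.5 to obtain
(69.7)   $x^i_{\alpha,\beta\gamma} = b_{\alpha\beta,\gamma}\, n^i + b_{\alpha\beta}\, n^i_{,\gamma} = b_{\alpha\beta,\gamma}\, n^i - a^{\delta\sigma}\, b_{\alpha\beta}\, b_{\sigma\gamma}\, x^i_\delta$.

Substituting from (69.7) in the left-hand member of (69.6), we get

$x^i_{\alpha,\beta\gamma} - x^i_{\alpha,\gamma\beta} = (b_{\alpha\beta,\gamma} - b_{\alpha\gamma,\beta})\, n^i - a^{\delta\sigma}(b_{\alpha\beta} b_{\sigma\gamma} - b_{\alpha\gamma} b_{\sigma\beta})\, x^i_\delta$.
Hence

(69.8)   $(b_{\alpha\beta,\gamma} - b_{\alpha\gamma,\beta})\, n^i - a^{\delta\sigma}(b_{\alpha\beta} b_{\sigma\gamma} - b_{\alpha\gamma} b_{\sigma\beta})\, x^i_\delta = R^{\delta}_{\cdot\alpha\beta\gamma}\, x^i_\delta$.

To obtain the equations of Codazzi we multiply (69.8) by $n_i$, and, since
$x^i_\delta n_i = 0$, we get the desired result

(69.9)   $b_{\alpha\beta,\gamma} - b_{\alpha\gamma,\beta} = 0$.

To obtain the equations of Gauss we multiply (69.8) by $g_{ij}\, x^j_\rho$ and obtain
(69.10)   $b_{\rho\beta}\, b_{\alpha\gamma} - b_{\rho\gamma}\, b_{\alpha\beta} = R_{\rho\alpha\beta\gamma}$.

Since $\alpha$, $\beta$ assume values 1, 2 and $b_{\alpha\beta} = b_{\beta\alpha}$, we see that there are
two independent equations of Codazzi and only one independent equation
of Gauss.²⁵ The independent equations of Codazzi are

(69.11)   $b_{\alpha\alpha,\beta} - b_{\alpha\beta,\alpha} = 0$, $(\alpha \neq \beta)$, (no sum on $\alpha$),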


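or, when the covariant derivatives are written out in full with the aid of

$b_{\alpha\beta,\gamma} = \dfrac{\partial b_{\alpha\beta}}{\partial u^\gamma} - \begin{Bmatrix}\delta\\ \alpha\gamma\end{Bmatrix} b_{\delta\beta} - \begin{Bmatrix}\delta\\ \beta\gamma\end{Bmatrix} b_{\alpha\delta}$,

we get

(69.12)   $\dfrac{\partial b_{\alpha\alpha}}{\partial u^\beta} - \dfrac{\partial b_{\alpha\beta}}{\partial u^\alpha} - \begin{Bmatrix}\delta\\ \alpha\beta\end{Bmatrix} b_{\alpha\delta} + \begin{Bmatrix}\delta\\ \alpha\alpha\end{Bmatrix} b_{\delta\beta} = 0$, $\alpha \neq \beta$, (no sum on $\alpha$).

25 We recall that $R_{\alpha\alpha\beta\gamma} = R_{\alpha\beta\gamma\gamma} = 0$, $R_{1212} = R_{2121} = -R_{2112} = -R_{1221}$.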
The equation of Gauss, on the other hand, is
(69.13)   $b_{11} b_{22} - b_{12}^2 = R_{1212}$.

This equation relates the coefficients $b_{\alpha\beta}$ and $a_{\alpha\beta}$ in the two fundamental
quadratic forms.

The foregoing demonstration shows that if the tensors $a_{\alpha\beta}$ and $b_{\alpha\beta}$ are
the fundamental tensors of the surface S: $x^i = x^i(u^1, u^2)$, then equations
69.11 and 69.13 are satisfied. Conversely, it can be shown that if the two
sets of functions $a_{\alpha\beta}$ and $b_{\alpha\beta}$ satisfying equations 69.11 and 69.13 are
prescribed, and if $a_{\alpha\beta}\, du^\alpha du^\beta$ is a positive definite form, then the surface
S is determined (locally) to within a rigid body motion in space. The
proof²⁶ of this depends on considering the existence of a solution of a
system of differential equations of the type discussed in Sec. 39. We
conclude by remarking that if $b_{\alpha\beta} = 0$, then by 65.5 the surface is a plane.


70. The Mean and Total Curvatures of a Surface 


If we recall the definition 62.4 of the total curvature K,

[62.4]   $K = \dfrac{R_{1212}}{a}$,   $a = a_{11} a_{22} - a_{12}^2$,

we can write equation 69.13 in the form

(70.1)   $K = \dfrac{b_{11} b_{22} - b_{12}^2}{a_{11} a_{22} - a_{12}^2} = \dfrac{b}{a}$.

Thus the Gaussian curvature is equal to the quotient of the discriminants of
the second and first fundamental quadratic forms.

We introduce next another important invariant H, called the mean
curvature of the surface. This is given by the formula

(70.2)   $H = \tfrac{1}{2}\, a^{\alpha\beta} b_{\alpha\beta}$,

and we shall see in Sec. 72 that the invariants K and H are connected in a
remarkable way with the ordinary curvatures of certain curves formed by
taking normal sections of the surface.


26 For a detailed discussion, see L. P. Eisenhart, Introduction to Differential Geometry,
pp. 218-221, where the case of cartesian variables $x^i$ is considered.


71. Curves on a Surface. Theorem of Meusnier 


Let equations of a smooth curve C lying on the surface

(71.1)   $x^i = x^i(u^1, u^2)$
be given in the form
(71.2)   C:  $u^\alpha = u^\alpha(s)$,

where s is the arc parameter. If the values of $u^\alpha(s)$ are inserted in (71.1),
we obtain the space coordinates $x^i$ of C in the form

(71.3)   $x^i = x^i(s)$.

These are the equations of C, regarded as a space curve. The properties
of C can then be studied with the aid of the Frenet-Serret formulas 50.1,
50.2, and 50.3 by analyzing the rates of change of the unit tangent vector $\lambda$,
the unit principal normal $\mu$, and the unit binormal $\nu$.

On the other hand, if we regard C as a surface curve, defined by (71.2),
the components $\lambda^\alpha$ of the unit tangent vector $\lambda$ are related to the space
components $\lambda^i$ of the same vector by the formulas

(71.4)   $\lambda^i = \dfrac{\partial x^i}{\partial u^\alpha}\dfrac{du^\alpha}{ds} = x^i_\alpha\, \lambda^\alpha$,
where
(71.5)   $\lambda^i = \dfrac{dx^i}{ds}$  and  $\lambda^\alpha = \dfrac{du^\alpha}{ds}$.

We also recall equation 63.4,

(71.6)   $\dfrac{\delta\lambda^\alpha}{\delta s} = \kappa_g\, \eta^\alpha$,

where $\eta^\alpha$ is the unit normal to C in the tangent plane to the surface, and
$\kappa_g$ is the geodesic curvature of C. (See Fig. 29.)


Fig. 29


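If we differentiate (71.4) intrinsically with respect to s, we obtain

$\dfrac{\delta\lambda^i}{\delta s} = x^i_{\alpha,\beta}\,\lambda^\alpha\lambda^\beta + x^i_\alpha\,\dfrac{\delta\lambda^\alpha}{\delta s}$,

which, upon taking into account Frenet's formula 50.1 and equation
71.6, becomes

$\kappa\,\mu^i = x^i_{\alpha,\beta}\,\lambda^\alpha\lambda^\beta + \kappa_g\, x^i_\alpha\,\eta^\alpha$.

The space components $\eta^i$ of $\eta$ are $\eta^i = x^i_\alpha \eta^\alpha$, and, if we recall Gauss's
formula $x^i_{\alpha,\beta} = b_{\alpha\beta}\, n^i$, the foregoing equation becomes

(71.7)   $\kappa\,\mu^i = b_{\alpha\beta}\lambda^\alpha\lambda^\beta\, n^i + \kappa_g\,\eta^i$,
where $n^i$ is the unit normal to the surface S.

Formula 71.7 states that the principal normal $\mu$ to C lies in the plane
of the vectors n and $\eta$. Since n, $\eta$, and $\lambda$ are orthonormal and $n \times \eta = \lambda$,
we have

(71.8)   $\varepsilon_{ijk}\, n^j \eta^k = \lambda_i$,
and, since $\lambda$ is orthogonal to the plane of n and $\mu$, $\mu \times n = -\sin\theta\,\lambda$, or
(71.9)   $\varepsilon_{ijk}\, \mu^j n^k = -\sin\theta\,\lambda_i$,

where $\theta$ is the angle between $\mu$ and n.
On multiplying (71.9) by $\kappa$ we get

$\kappa\,\varepsilon_{ijk}\, \mu^j n^k = -\kappa\sin\theta\,\lambda_i$,
which, on substitution from (71.7), yields
$\varepsilon_{ijk}\,(b_{\alpha\beta}\lambda^\alpha\lambda^\beta\, n^j + \kappa_g\,\eta^j)\, n^k = -\kappa\sin\theta\,\lambda_i$.
But $\varepsilon_{ijk}\, n^j n^k = 0$ and $\varepsilon_{ijk}\, \eta^j n^k = -\lambda_i$ by (71.8), and we conclude that
(71.10)   $\kappa_g = \kappa\sin\theta$.

On the other hand, if we form the scalar product of both members of
equation 71.7 with $n_i$ and note that $n_i\mu^i = \cos\theta$, we get

(71.11)   $\kappa\cos\theta = b_{\alpha\beta}\lambda^\alpha\lambda^\beta$.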


The invariant $b_{\alpha\beta}\lambda^\alpha\lambda^\beta$ in (71.11) has the same value for all curves on S
with the same tangent vector $\lambda$ at P. In particular, it has this value for the
curve formed by the intersection of S with the normal plane containing n and $\lambda$.
But for every normal plane section the angle $\theta$ is either 0 or $\pi$ radians,
so that for the normal plane section $\kappa\cos\theta = \kappa$ or $-\kappa$; since the right-hand
member of (71.11) is an invariant, the value of $\kappa\cos\theta$ for every
curve C tangent to $\lambda$ is equal to the curvature $\kappa_{(n)}$ of the normal section
in the direction $\lambda$. The curvature $\kappa_{(n)}$ is called the normal curvature of
the surface S in the direction $\lambda$. We can thus write (71.11) as

(71.12)   $\kappa_{(n)} = b_{\alpha\beta}\lambda^\alpha\lambda^\beta$,

where $\kappa_{(n)} = \kappa\cos\theta$. Accordingly, Eq. 71.7 can be written as

$\kappa\,\mu^i = \kappa_{(n)}\, n^i + \kappa_g\,\eta^i$.
This equation states that $\kappa_{(n)}$ and $\kappa_g$ are the components of the curvature
vector $\kappa\mu^i$ in the directions of the vectors $n^i$ and $\eta^i$.
The result embodied in formula 71.12 can be stated as
MEUSNIER'S THEOREM. The radius of curvature $R = 1/\kappa$ of any curve
at a given point on the surface is equal to the product of the radius of curvature
$R_{(n)} = 1/\kappa_{(n)}$ of the corresponding normal section at that point by the
cosine of the angle between the normal to the surface and the principal
normal to the curve.
In symbols, we have
$R = \pm R_{(n)}\cos\theta$.


If S is a sphere, every normal section is a great circle of the sphere, 
and if C is any circle drawn on the sphere, then the preceding result 
becomes obvious from elementary geometric considerations. (See Fig. 30.) 
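
The sphere case can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `meusnier_check` and the sample values are illustrative assumptions, and the unit normal to the sphere is taken to point toward the center.

```python
import math

def meusnier_check(R, psi):
    """Compare the radius of a circle of latitude with R_(n) * cos(theta).

    R   : radius of the sphere (illustrative value)
    psi : polar angle of the circle measured from the axis, 0 < psi < pi
    """
    # A point of the circle, taken in the plane y = 0.
    p = (R * math.sin(psi), 0.0, R * math.cos(psi))
    # Unit normal to the sphere, assumed here to point toward the center.
    n = tuple(-c / R for c in p)
    # Principal normal of the circle: it points from p toward the axis.
    mu = (-1.0, 0.0, 0.0)
    cos_theta = sum(u * v for u, v in zip(n, mu))
    radius_of_circle = R * math.sin(psi)      # elementary geometry
    radius_of_normal_section = R              # every normal section is a great circle
    return radius_of_circle, radius_of_normal_section * cos_theta

lhs, rhs = meusnier_check(2.0, 0.7)
print(lhs, rhs)  # the two values should agree up to round-off
```

The two returned numbers are the left and right members of $R = R_{(n)}\cos\theta$ for the chosen circle.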

If we recall that $ds^2 = a_{\alpha\beta}\, du^\alpha du^\beta$, and $du^\alpha/ds = \lambda^\alpha$, we see that formula
71.12 can be put in the form

(71.13)   $\kappa_{(n)} = \dfrac{b_{\alpha\beta}\, du^\alpha du^\beta}{a_{\alpha\beta}\, du^\alpha du^\beta} = \dfrac{B}{A}$.

We note that, if the surface is a plane, the normal curvature $\kappa_{(n)} = 0$ at
all points of the plane, and if it is a sphere, $\kappa_{(n)} = 1/R$, where R is the
radius of the sphere. Accordingly, we conclude from (71.13) that for the
plane $b_{\alpha\beta} = 0$, and for the sphere $b_{\alpha\beta}\, du^\alpha du^\beta = (1/R)\, a_{\alpha\beta}\, du^\alpha du^\beta$, so that
$a_{\alpha\beta} = R\, b_{\alpha\beta}$ at all points of the sphere.

Fig. 30


Problems

1. Prove that the geodesic curvature $\kappa_g$ and the curvature $\kappa$ of any surface
curve C are connected by the formula $\kappa_g = \kappa\sin\theta$, where $\theta$ is the angle between
the normal to the surface and the principal normal to C.

2. Consider the surface of the right circular cone

S:  $y^1 = u^1\cos u^2$,
    $y^2 = u^1\sin u^2$,
    $y^3 = u^1$,
and the curve

C:  $u^1 = a$, $u^2 = u^2$ on S.

Write equations of C in the form $u^1 = a$, $u^2 = s/a$, where s is the arc parameter,
and show with the aid of (71.6) that $\kappa_g = \sqrt{2}/(2a)$. Verify this result by (71.10).

3. Show that the parallels $u^1$ = constant on a sufficiently smooth surface of
revolution

$y^1 = u^1\cos u^2$,
$y^2 = u^1\sin u^2$,
$y^3 = f(u^1)$
are curves of constant geodesic curvature.

4. A curve C on a surface S is called an asymptotic line if $b_{\alpha\beta}\lambda^\alpha\lambda^\beta = 0$ along
C. Show that the principal normal $\mu$ to the asymptotic line is tangent to S and
the binormal $\nu$ is normal to S.

5. Show that the normal curvatures in the directions of the coordinate curves
are $b_{11}/a_{11}$ and $b_{22}/a_{22}$.

6. Prove the theorem: If a curve is a geodesic on the surface, then either it is a
straight line or its principal normal is orthogonal to the surface at every point,
and conversely.


72. The Principal Curvatures of a Surface 


We will be concerned in this section with the determination of directions
$\lambda^\alpha = du^\alpha/ds$ on the surface such that the normal curvature $\kappa_{(n)}$, given by
the formula
[71.12]   $\kappa_{(n)} = b_{\alpha\beta}\lambda^\alpha\lambda^\beta$,
assumes an extreme value.

Since the vector $\lambda^\alpha$ is a unit vector, $\kappa_{(n)}$ in (71.12) has to be maximized
subject to the constraining relation
(72.1)   $a_{\alpha\beta}\lambda^\alpha\lambda^\beta = 1$.

Following the usual procedure of determining constrained maxima and
minima, we deduce that a necessary condition for an extremum is

(72.2)   $b_{\alpha\beta}\lambda^\beta + \Lambda\, a_{\alpha\beta}\lambda^\beta = 0$,

where $\Lambda$ is the Lagrange multiplier. If equation 72.2 is multiplied by
$\lambda^\alpha$ and account is taken of relations 71.12 and 72.1, it follows at once
that $\Lambda = -\kappa_{(n)}$. Thus equation 72.2, for the determination of directions
yielding extreme values of $\kappa_{(n)}$, can be written as
(72.3)   $(b_{\alpha\beta} - \kappa_{(n)}\, a_{\alpha\beta})\lambda^\beta = 0$, $(\alpha = 1, 2)$.

The set of homogeneous equations 72.3 will possess nontrivial solutions
for $\lambda^\beta$ if, and only if, the values of $\kappa_{(n)}$ are the roots of the determinantal
equation
(72.4)   $|b_{\alpha\beta} - \kappa_{(n)}\, a_{\alpha\beta}| = 0$.


The quadratic equation 72.4, when written out in expanded form, is

(72.5)   $\kappa_{(n)}^2 - a^{\alpha\beta} b_{\alpha\beta}\,\kappa_{(n)} + \dfrac{b}{a} = 0$,

where $b = |b_{\alpha\beta}|$ and $a = |a_{\alpha\beta}|$.
Since the Gaussian curvature K is given by

[70.1]   $K = \dfrac{b}{a}$
and the mean curvature H is

[70.2]   $H = \tfrac{1}{2}\, a^{\alpha\beta} b_{\alpha\beta}$,

we see that equation 72.5 assumes the form
(72.6)   $\kappa_{(n)}^2 - 2H\kappa_{(n)} + K = 0$.

The roots $\kappa_{(n)} = \kappa_{(1)}$ and $\kappa_{(n)} = \kappa_{(2)}$ of (72.6) are called the principal curvatures
of the surface, and the directions $\lambda^\alpha_{(1)}$ and $\lambda^\alpha_{(2)}$, corresponding to these
extreme values of $\kappa_{(n)}$, are the principal directions on the surface. We
leave it for the reader to show that these directions are real.

From (72.6) it is clear that the principal curvatures $\kappa_{(1)}$ and $\kappa_{(2)}$ are
related to the mean and Gaussian curvatures by the formulas
(72.7)   $\kappa_{(1)} + \kappa_{(2)} = 2H$,
         $\kappa_{(1)}\kappa_{(2)} = K$.
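
Formulas 70.1, 70.2, and 72.6 translate directly into a short computation. The sketch below is an illustration only: the function name `curvatures` and the numerical values are assumptions, and the sample coefficients are those stated in Problem 1 at the end of this section for an ellipsoid of revolution.

```python
import math

def curvatures(a, b):
    """Return (K, H, k1, k2) for 2x2 symmetric nested lists a and b.

    a holds the coefficients of the first fundamental form (positive definite),
    b those of the second fundamental form.
    """
    det_a = a[0][0] * a[1][1] - a[0][1] ** 2
    det_b = b[0][0] * b[1][1] - b[0][1] ** 2
    K = det_b / det_a                                   # formula (70.1)
    a_up = [[a[1][1] / det_a, -a[0][1] / det_a],
            [-a[0][1] / det_a, a[0][0] / det_a]]        # contravariant a^{alpha beta}
    H = 0.5 * sum(a_up[i][j] * b[i][j]
                  for i in range(2) for j in range(2))  # formula (70.2)
    root = math.sqrt(max(H * H - K, 0.0))               # guard against round-off
    return K, H, H - root, H + root                     # roots of (72.6)

# Ellipsoid of revolution; the semi-axes and the parameter value are illustrative.
semi_a, semi_c, u2 = 2.0, 1.0, 1.0
w = semi_a ** 2 * math.cos(u2) ** 2 + semi_c ** 2 * math.sin(u2) ** 2
a = [[semi_a ** 2 * math.sin(u2) ** 2, 0.0], [0.0, w]]
b = [[semi_a * semi_c * math.sin(u2) ** 2 / math.sqrt(w), 0.0],
     [0.0, semi_a * semi_c / math.sqrt(w)]]
K, H, k1, k2 = curvatures(a, b)
print(K, semi_c ** 2 / w ** 2)  # the two numbers should agree up to round-off
```

The returned roots satisfy relations 72.7 by construction, which gives a quick consistency check on hand computations.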


From equation 72.3 it follows that the principal directions are determined
by

$(b_{\alpha\beta} - \kappa_{(1)}\, a_{\alpha\beta})\lambda^\beta_{(1)} = 0$,

$(b_{\alpha\beta} - \kappa_{(2)}\, a_{\alpha\beta})\lambda^\beta_{(2)} = 0$.
If the first of these equations is multiplied by $\lambda^\alpha_{(2)}$, the second by $\lambda^\alpha_{(1)}$,
and the results subtracted, we obtain

(72.8)   $(\kappa_{(2)} - \kappa_{(1)})\, a_{\alpha\beta}\lambda^\alpha_{(1)}\lambda^\beta_{(2)} = 0$.

If $\kappa_{(1)} \neq \kappa_{(2)}$, equation 72.8 tells us that
(72.9)   $a_{\alpha\beta}\lambda^\alpha_{(1)}\lambda^\beta_{(2)} = 0$,

that is, the principal directions are orthogonal. If the extreme values of
$\kappa_{(n)}$ are equal at a given point, then every direction is a principal direction.

We can summarize these results as a

THEOREM. At each point of a surface there exist two mutually orthogonal
directions for which the normal curvature attains its extreme values.

A curve on a surface such that the tangent line to it at every point is
directed along a principal direction is called a line of curvature. The
differential equation for which the lines of curvature on S are the integral
curves follows directly from equations 72.3. If we eliminate $\kappa_{(n)}$ from
these equations and set $\lambda^\alpha = du^\alpha/ds$, we get

$\dfrac{b_{1\beta}\, du^\beta}{a_{1\beta}\, du^\beta} = \dfrac{b_{2\beta}\, du^\beta}{a_{2\beta}\, du^\beta}$,
or

(72.10)   $(b_{11}a_{12} - b_{12}a_{11})(du^1)^2 + (b_{11}a_{22} - b_{22}a_{11})\, du^1 du^2 + (b_{12}a_{22} - a_{12}b_{22})(du^2)^2 = 0$.
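
Equation 72.10 is a quadratic in the ratio $du^2/du^1$, so the two directions can be computed and their orthogonality tested. The following is a minimal sketch under stated assumptions: the helper names and the sample coefficients are illustrative, and the code handles only the case in which the coefficient of $(du^2)^2$ does not vanish.

```python
import math

def curvature_directions(a, b):
    """Return the two slopes du2/du1 that solve equation (72.10).

    a, b : 2x2 symmetric nested lists holding a_{alpha beta} and b_{alpha beta}.
    Assumes the coefficient of (du2)^2 is nonzero, so that both directions
    have du1 != 0; other cases need separate handling.
    """
    c_11 = b[0][0] * a[0][1] - b[0][1] * a[0][0]   # multiplies (du1)^2
    c_12 = b[0][0] * a[1][1] - b[1][1] * a[0][0]   # multiplies du1 du2
    c_22 = b[0][1] * a[1][1] - a[0][1] * b[1][1]   # multiplies (du2)^2
    if c_22 == 0:
        raise ValueError("coefficient of (du2)^2 vanishes")
    # With t = du2/du1, (72.10) becomes c_22 t^2 + c_12 t + c_11 = 0.
    disc = c_12 ** 2 - 4.0 * c_22 * c_11
    root = math.sqrt(disc)
    return (-c_12 - root) / (2.0 * c_22), (-c_12 + root) / (2.0 * c_22)

def metric_product(a, t1, t2):
    """Value of a_{alpha beta} on the directions (1, t1) and (1, t2)."""
    return a[0][0] + a[0][1] * (t1 + t2) + a[1][1] * t1 * t2

a = [[2.0, 0.3], [0.3, 1.5]]    # illustrative positive definite coefficients
b = [[0.4, 0.1], [0.1, -0.2]]   # illustrative symmetric coefficients
t1, t2 = curvature_directions(a, b)
print(metric_product(a, t1, t2))  # close to zero: the directions are orthogonal
```

The vanishing of `metric_product` is relation 72.9 written for the unnormalized directions $(1, t_1)$ and $(1, t_2)$.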


At each point of S where either $b_{\alpha\beta}\, du^\alpha du^\beta \neq 0$ or $b_{\alpha\beta}\, du^\alpha du^\beta$ is not
proportional to $a_{\alpha\beta}\, du^\alpha du^\beta$, equation 72.10 specifies two orthogonal
directions

(72.11)   $\dfrac{du^2}{du^1} = \varphi_1(u^1, u^2)$,   $\dfrac{du^2}{du^1} = \varphi_2(u^1, u^2)$,

which coincide with directions of the principal curvatures.²⁷ Each equation
in (72.11) determines a family of curves on S covering the surface without
gaps. These two families of curves are orthogonal, and, if they are taken
as a parametric net on S, the first fundamental form has the form

$(ds)^2 = \bar a_{11}(d\bar u^1)^2 + \bar a_{22}(d\bar u^2)^2$.
Accordingly, equation 72.10 in the coordinate system $\bar u^\alpha$ takes the form
$-\bar b_{12}\bar a_{11}(d\bar u^1)^2 + (\bar b_{11}\bar a_{22} - \bar b_{22}\bar a_{11})\, d\bar u^1 d\bar u^2 + \bar b_{12}\bar a_{22}(d\bar u^2)^2 = 0$,
and its solutions are
$\bar u^1$ = constant, $\bar u^2$ = constant.

If we take $d\bar u^1 \neq 0$ and $d\bar u^2 = 0$, we see that $\bar b_{12} = 0$, since $\bar a_{11} \neq 0$. Thus
a necessary condition for the net of lines of curvature to be orthogonal
is that $a_{12} = b_{12} = 0$. Conversely, if $a_{12} = b_{12} = 0$, then (72.10) has
solutions $u^1$ = constant, $u^2$ = constant, so that the coordinate lines are
the lines of curvature. Hence a

THEOREM. A necessary and sufficient condition for the coordinate net
on a surface S (other than a plane or a sphere) to be the net of lines of
curvature is that $a_{12} = b_{12} = 0$ at all points of S.

We note that for every orthogonal net on a plane or a sphere $a_{12} = b_{12} = 0$.

27 We exclude those points on S at which $\kappa_{(n)} = 0$ or $\kappa_{(1)} = \kappa_{(2)}$. See concluding
remarks in Sec. 71.

Formula 71.13, for the normal curvature $\kappa_{(n)}$, when the coordinate
system is taken to be the net of lines of curvature becomes

$\kappa_{(n)} = \dfrac{b_{11}(du^1)^2 + b_{22}(du^2)^2}{a_{11}(du^1)^2 + a_{22}(du^2)^2}$.

If we set $du^2 = 0$, $du^1 \neq 0$, and $du^1 = 0$, $du^2 \neq 0$, we get

$\kappa_{(1)} = \dfrac{b_{11}}{a_{11}}$,   $\kappa_{(2)} = \dfrac{b_{22}}{a_{22}}$

for the curvatures of the coordinate lines $u^2$ = constant and $u^1$ = constant, respectively.

The lines of curvature on S should not be confused with the normal
sections of S. The normal sections $C_{(n)}$ are necessarily plane curves,
whereas the lines of curvature ordinarily are not plane curves.

We conclude this section by giving several definitions. 

A surface at all points of which the Gaussian curvature K is positive
is called a surface of positive curvature. In this case (see equation 70.1),
$b_{11}b_{22} - b_{12}^2 > 0$, and, since $\kappa_{(n)} = b_{\alpha\beta}\lambda^\alpha\lambda^\beta$, we see that the principal radii
$R_{(n)} = 1/\kappa_{(n)}$ to all normal sections of a surface with positive curvature
do not differ in sign. If K < 0 at a given point, the principal radii differ
in sign. Then the equation

(72.12)   $b_{\alpha\beta}\lambda^\alpha\lambda^\beta = 0$

defines two directions for which the radii of curvature are infinite. A
surface at all points of which K < 0 is called a surface of negative curvature.
If K = 0 at a given point, the directions given by (72.12) coincide, and
for this direction R is infinite.

From geometrical considerations it is clear that ellipsoids, biparted
hyperboloids, and elliptic paraboloids are surfaces of positive curvature.
Hyperboloids of one sheet and hyperbolic paraboloids are surfaces of
negative curvature.

A point on S is said to be elliptic if the signs of the principal curvatures
$\kappa_{(1)}$, $\kappa_{(2)}$ are the same. It follows then that $\kappa_{(n)}$ at an elliptic point does
not change sign for any direction of the normal section. A point is
hyperbolic if $\kappa_{(1)}$ and $\kappa_{(2)}$ have opposite signs. At a hyperbolic point there
are two directions for which $\kappa_{(n)} = 0$. A point is parabolic if one of the
values $\kappa_{(1)}$ or $\kappa_{(2)}$ is zero. In the special case $\kappa_{(1)} = \kappa_{(2)}$, all values of $\kappa_{(n)}$
are equal and such points are called spherical. In the neighborhood of
a spherical point the surface looks like a sphere, and we can prove that
if all points of S are spherical, then the surface S is a sphere. In some
books spherical points are also called umbilical.


Problems

1. Given an ellipsoid of revolution, whose surface is determined by

$y^1 = a\cos u^1 \sin u^2$,
$y^2 = a\sin u^1 \sin u^2$,
$y^3 = c\cos u^2$,

show that
$a_{11} = a^2\sin^2 u^2$,  $a_{12} = 0$,  $a_{22} = a^2\cos^2 u^2 + c^2\sin^2 u^2$,

$b_{11} = \dfrac{ac\sin^2 u^2}{\sqrt{a^2\cos^2 u^2 + c^2\sin^2 u^2}}$,  $b_{12} = 0$,  $b_{22} = \dfrac{ac}{\sqrt{a^2\cos^2 u^2 + c^2\sin^2 u^2}}$,
and

$K = \kappa_{(1)}\kappa_{(2)} = \dfrac{c^2}{(a^2\cos^2 u^2 + c^2\sin^2 u^2)^2}$.

Discuss the lines of curvature on this surface.

2. Find the principal curvatures of the surface defined by

$y^1 = u^1$,
$y^2 = u^2$,
$y^3 = f(u^1, u^2)$.

3. Show that the helicoid

$y^1 = u^1\cos u^2$,
$y^2 = u^1\sin u^2$,
$y^3 = a u^2$

is a surface of negative curvature.

4. Show that when at all points of the surface $b_{\alpha\beta}\, du^\alpha du^\beta$ is proportional to
$a_{\alpha\beta}\, du^\alpha du^\beta$, then $b_{\alpha\beta} = k a_{\alpha\beta}$, k = constant. Interpret this result geometrically.

5. Prove that if every point of the surface S is parabolic, then S is developable.

6. Given a surface of revolution S,

$y^1 = r\cos\varphi$,  $y^2 = r\sin\varphi$,  $y^3 = f(r)$,

with f(r) of class $C^2$. Prove that the lines of curvature on S are the meridians
$\varphi$ = constant and the parallels r = constant.

7. Refer to Problem 6 and show that the points on a surface of revolution
S for which $f'f'' > 0$ are elliptic; those for which $f'f'' < 0$ are hyperbolic; and
if $f'' = 0$, then S is a cone.




8. Let the vector equation of a curve C drawn on a smooth surface S be
r = r(s). If n(s) is a unit normal to S at a given point of C, and v is a parameter
measuring the distance along n, the vector equation R(s, v) = r(s) + v n(s)
defines a ruled surface S'. Prove that S' is developable if C is a line of curvature,
and conversely. Outline of solution: Denote the coefficients in the second
fundamental form of S' by $d_{\alpha\beta}$. Compute the $d_{\alpha\beta}$ from the formula

$d_{\alpha\beta} = \dfrac{\partial^2 \mathbf{R}}{\partial v^\alpha\,\partial v^\beta}\cdot\mathbf{N}$,

where $v^1 = s$, $v^2 = v$, and N is the unit normal to S'. Show that $d_{22} = 0$. The
developability condition $d_{11}d_{22} - d_{12}^2 = 0$ implies then that $d_{12} = 0$. But, along
C, $\mathbf{N} = d\mathbf{r}/ds \times \mathbf{n}$; hence

$d_{12} = \dfrac{d\mathbf{n}}{ds}\cdot\left(\dfrac{d\mathbf{r}}{ds}\times\mathbf{n}\right) = 0$.

If dn/ds = 0 along C, then S' is a cylinder, which is a developable surface. If
dn/ds and dr/ds are collinear, then dn = k dr, which leads to the set of equations
$(b_{\alpha\beta} - k a_{\alpha\beta})\, du^\beta = 0$ of the type (72.3). Retrace steps to prove the converse.

9. Let C be a smooth curve defined by r = r(s). The tangent surface S to C
is defined by R(s, v) = r(s) + v(dr/ds), where v is a parameter measured along the
tangent dr/ds. Prove that S is developable. The curve C is called the edge of
regression for S.

10. Prove Dupin's theorem. The coordinate surfaces of every triply orthogonal
curvilinear coordinate system in $E_3$ intersect along the lines of curvature
of coordinate surfaces. Hint: Consider the surface $x^3$ = constant and take
$x^1 = u^1$, $x^2 = u^2$ as surface coordinates on it. Show that along the coordinate
lines $u^1$ = constant, $u^2$ = constant, $b_{12} = 0$ if $a_{12} = 0$. See Problem 4, Sec. 67.


73. Parallel Surfaces

Let S be a smooth surface defined by equations
(73.1)   $y^i = y^i(u^1, u^2)$, $(i = 1, 2, 3)$,

where the coordinates $y^i$ are orthogonal cartesian. A surface $\bar S$ determined
by equations

(73.2)   $\bar y^i(u^1, u^2) = y^i(u^1, u^2) + h\, n^i(u^1, u^2)$,

where n is the unit normal to S and h is the distance measured along the
normal n, is called a parallel surface to S.

Parallel surfaces figure prominently in the theory of elastic plates and
shells, where relations connecting the Gaussian curvature K and the mean
curvature H of S with the corresponding invariants for the surface $\bar S$
are important.

We proceed to outline a derivation of such relations by recalling first
that the base vectors $\mathbf{a}_\alpha$, along the $u^\alpha$-coordinate curves, are related to
the base vectors $\mathbf{b}_i$ along the $y^i$-axes by

$\mathbf{a}_\alpha = \dfrac{\partial y^i}{\partial u^\alpha}\,\mathbf{b}_i$.

For simplicity in writing we introduce the notations (cf. Sec. 64)

[64.10]   $\dfrac{\partial y^i}{\partial u^\alpha} \equiv y^i_\alpha = y^i_{,\alpha}$,
and, on differentiating (73.2), we get
(73.3)   $\bar y^i_\alpha = y^i_\alpha + h\, n^i_{,\alpha}$,
so that
(73.4)   $\bar y^i_\alpha\, n_i = y^i_\alpha\, n_i + h\, n^i_{,\alpha}\, n_i$.

But $y^i_\alpha n_i = 0$, for the $\mathbf{a}_\alpha$ are orthogonal to n, and $n^i_{,\alpha} n_i = 0$ since $n^i n_i = 1$.
Thus (73.4) reduces to
(73.5)   $\bar y^i_\alpha\, n_i = 0$.
On the other hand, the unit normal $\bar n_i$ to $\bar S$ is orthogonal to the base
vectors $\bar y^i_\alpha$ on $\bar S$, so that
(73.6)   $\bar y^i_\alpha\, \bar n_i = 0$.

We conclude from (73.5) and (73.6) that the vectors $n_i$ and $\bar n_i$ are
collinear, and, since they are unit vectors, $n_i = \bar n_i$.

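The metric coefficients $\bar a_{\alpha\beta}$ of $\bar S$ are given by²⁸

$\bar a_{\alpha\beta} = \bar y^i_\alpha\, \bar y^i_\beta$,
which, on making use of (73.3), yield
$\bar a_{\alpha\beta} = (y^i_\alpha + h\, n^i_{,\alpha})(y^i_\beta + h\, n^i_{,\beta})$
$\qquad = y^i_\alpha y^i_\beta + h\, n^i_{,\alpha}\, y^i_\beta + h\, n^i_{,\beta}\, y^i_\alpha + h^2\, n^i_{,\alpha}\, n^i_{,\beta}$.

The substitution in this expression from the Weingarten formula,

[69.5]   $n^i_{,\alpha} = -a^{\gamma\delta}\, b_{\gamma\alpha}\, y^i_\delta$,
gives
(73.7)   $\bar a_{\alpha\beta} = a_{\alpha\beta} - 2h\, b_{\alpha\beta} + h^2\, n^i_{,\alpha}\, n^i_{,\beta}$,

since $y^i_\alpha y^i_\beta = a_{\alpha\beta}$.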


28 See (64.6) and recall that $g_{ij} = \delta_{ij}$ since the coordinates $y^i$ are orthogonal cartesian.


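The last term in the right-hand member of (73.7) can be expressed in
terms of the Gaussian curvature K and the mean curvature H as follows.
On making use of (69.5) we get

(73.8)   $n^i_{,\alpha}\, n^i_{,\beta} = a^{\gamma\delta}\, b_{\gamma\alpha}\, y^i_\delta\; a^{\lambda\mu}\, b_{\lambda\beta}\, y^i_\mu$
$\qquad\qquad = a^{\gamma\delta} a^{\lambda\mu} a_{\delta\mu}\, b_{\gamma\alpha} b_{\lambda\beta}$
$\qquad\qquad = a^{\gamma\lambda}\, b_{\gamma\alpha} b_{\lambda\beta}$,

since $y^i_\delta y^i_\mu = a_{\delta\mu}$. On the other hand, the Gauss equations 69.10 require
that

[69.10]   $R_{\alpha\beta\gamma\delta} = b_{\alpha\gamma} b_{\beta\delta} - b_{\alpha\delta} b_{\beta\gamma}$,

where
$R_{\alpha\beta\gamma\delta} = K\,\varepsilon_{\alpha\beta}\,\varepsilon_{\gamma\delta}$
by (62.5), so that
$K\,\varepsilon_{\alpha\beta}\,\varepsilon_{\gamma\delta} = b_{\alpha\gamma} b_{\beta\delta} - b_{\alpha\delta} b_{\beta\gamma}$.

We multiply both members of this relation by $a^{\alpha\delta}$, sum on the repeated indices, and note
that (cf. Sec. 62)

$a^{\alpha\delta}\,\varepsilon_{\alpha\beta}\,\varepsilon_{\gamma\delta} = -a_{\beta\gamma}$
and find
(73.9)   $-K a_{\beta\gamma} = a^{\alpha\delta}\, b_{\alpha\gamma} b_{\beta\delta} - 2H b_{\beta\gamma}$,
since
[70.2]   $H = \tfrac{1}{2}\, a^{\alpha\delta} b_{\alpha\delta}$.

The substitution in the first term in the right-hand member of (73.9)
from (73.8) yields

(73.10)   $n^i_{,\alpha}\, n^i_{,\beta} = -K a_{\alpha\beta} + 2H b_{\alpha\beta}$,
and we can thus write (73.7) in the form
(73.11)   $\bar a_{\alpha\beta} = a_{\alpha\beta}(1 - h^2 K) - 2h\, b_{\alpha\beta}(1 - hH)$.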


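The important formula 73.11 enables us to compute the coefficients
$\bar a_{\alpha\beta}$ at a given point $\bar P(u^1, u^2)$ on $\bar S$ from the values of $a_{\alpha\beta}$, $b_{\alpha\beta}$, K, and H at
the corresponding point $P(u^1, u^2)$ on S.
To compute the coefficients $\bar b_{\alpha\beta}$ in the second fundamental form of $\bar S$,
we recall that
$\bar b_{\alpha\beta} = \bar y^i_{\alpha,\beta}\, \bar n_i$
by (67.7). But from (73.3),
$\bar y^i_{\alpha,\beta} = y^i_{\alpha,\beta} + h\, n^i_{,\alpha\beta}$,
so that
(73.12)   $\bar b_{\alpha\beta} = b_{\alpha\beta} + h\, n^i_{,\alpha\beta}\, n_i$.

Since the coordinates $y^i$ are rectangular cartesian, $n^i = n_i$, $n^i n^i = 1$, and
we conclude that

$n^i\, n^i_{,\alpha} = 0$.
On differentiating this orthogonality relation we find that $n^i\, n^i_{,\alpha\beta} = -n^i_{,\alpha}\, n^i_{,\beta}$,
and hence (73.12) can be written as

$\bar b_{\alpha\beta} = b_{\alpha\beta} - h\, n^i_{,\alpha}\, n^i_{,\beta}$.
On making use of (73.10), we finally obtain
(73.13)   $\bar b_{\alpha\beta} = (1 - 2hH)\, b_{\alpha\beta} + hK a_{\alpha\beta}$.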


Formulas 73.11 and 73.13 appear on pages 110-111 of a monograph
by T. Y. Thomas, Concepts from Tensor Analysis and Differential Geometry,
Academic Press (1961). They are closely related to formulas on page 272
of L. P. Eisenhart's Differential Geometry, Princeton Press (1940).

Once the coefficients $\bar a_{\alpha\beta}$, $\bar b_{\alpha\beta}$ are known, the Gaussian and mean curvatures
$\bar K$ and $\bar H$ can be computed from formulas 70.1 and 70.2. The result
of somewhat lengthy computations, which will be found on pages 111 to
113 of the mentioned monograph by T. Y. Thomas, is

(73.14)   $\bar K = \dfrac{K}{1 - 2hH + h^2 K}$,
          $\bar H = \dfrac{H - hK}{1 - 2hH + h^2 K}$.

From the first of these elegant formulas it follows that when S is a developable
surface, then the parallel surfaces $\bar S$ are also developable.
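
Formulas 73.11, 73.13, and 73.14 can be cross-checked numerically at a single point. The sketch below is illustrative only: the helper names and the sample coefficients are assumptions, not data from the text. It builds the barred coefficients from 73.11 and 73.13, computes their curvatures from 70.1 and 70.2, and compares the result with 73.14.

```python
def det2(m):
    """Determinant of a 2x2 nested list."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def gauss_and_mean(a, b):
    """K = det(b)/det(a) as in (70.1); H = (1/2) a^{alpha beta} b_{alpha beta} as in (70.2)."""
    d = det2(a)
    a_up = [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]
    H = 0.5 * sum(a_up[i][j] * b[i][j] for i in range(2) for j in range(2))
    return det2(b) / d, H

def parallel_surface_forms(a, b, h):
    """Coefficients of the two fundamental forms of the parallel surface."""
    K, H = gauss_and_mean(a, b)
    a_bar = [[a[i][j] * (1 - h * h * K) - 2 * h * b[i][j] * (1 - h * H)
              for j in range(2)] for i in range(2)]           # formula (73.11)
    b_bar = [[(1 - 2 * h * H) * b[i][j] + h * K * a[i][j]
              for j in range(2)] for i in range(2)]           # formula (73.13)
    return a_bar, b_bar

a = [[2.0, 0.3], [0.3, 1.5]]    # illustrative positive definite a_{alpha beta}
b = [[0.4, 0.1], [0.1, -0.2]]   # illustrative symmetric b_{alpha beta}
h = 0.25                        # illustrative distance along the normal
K, H = gauss_and_mean(a, b)
K_bar, H_bar = gauss_and_mean(*parallel_surface_forms(a, b, h))
denom = 1 - 2 * h * H + h * h * K
K_formula, H_formula = K / denom, (H - h * K) / denom         # formula (73.14)
print(K_bar, K_formula)
print(H_bar, H_formula)
```

Each printed pair should agree up to round-off, provided the denominator in 73.14 does not vanish for the chosen h.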


Problems

1. If S is a surface of revolution, show that a parallel surface $\bar S$ is also a surface
of revolution. Hint: Consider $y^1 = u^1\cos u^2$, $y^2 = u^1\sin u^2$, $y^3 = f(u^1)$.

2. Show that the principal radii $R_{(n)}$ of normal curvature of S are related to
the principal radii $\bar R_{(n)}$ of a parallel surface $\bar S$ by

$\bar R_{(n)} = R_{(n)} - h$.


74. The Gauss-Bonnet Theorem 


The description of surfaces with the aid of differential equations has 
local character, since relations among the derivatives describe properties 
of surfaces only in the neighborhood of a point. To obtain results valid 
for the entire surface one must perform integrations. Because of the 
complex structure of differential equations of the theory of surfaces, 
relatively few global results have been obtained, and the available results
in global geometry are largely concerned with a special class of convex
surfaces. There is one important classical result, however, that relates
the integral of the Gaussian curvature evaluated over the area of an
arbitrary smooth surface to the line integral of the geodesic curvature
computed over the curve that bounds the area. Gauss viewed this result
as the most elegant theorem of geometry of surfaces in the large.

Fig. 31

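Let D be a region bounded by a closed piecewise smooth curve C
drawn on a smooth surface S, shown in Fig. 31. We shall suppose that
D is homeomorphic to a circular disk.²⁹ We saw in Sec. 63 that a unit
tangent vector $\lambda^\alpha$ to a surface curve C is related to the unit vector $\eta^\alpha$
normal to $\lambda^\alpha$ by

[63.4]   $\dfrac{\delta\lambda^\alpha}{\delta s} = \kappa_g\, \eta^\alpha$,

where $\kappa_g$ is the geodesic curvature of C. Moreover, if C is a geodesic,
then $\kappa_g = 0$ at all points of C, and conversely.

Since $\eta^\alpha$ is orthogonal to $\lambda^\alpha$, it follows from the concluding paragraph
of Sec. 54 that $\varepsilon_{\alpha\beta}\,\eta^\alpha\lambda^\beta = 1$, or

$\eta_\alpha = \varepsilon_{\alpha\beta}\,\lambda^\beta$.
But from (63.4)
$\kappa_g = \eta_\alpha\,\dfrac{\delta\lambda^\alpha}{\delta s}$,
or
(74.1)   $\kappa_g = \varepsilon_{\alpha\beta}\,\lambda^\beta\,\dfrac{\delta\lambda^\alpha}{\delta s}$.
The integration of this expression over the curve C gives
(74.2)   $\displaystyle\int_C \kappa_g\, ds = \int_C \varepsilon_{\alpha\beta}\,\lambda^\beta\,\dfrac{\delta\lambda^\alpha}{\delta s}\, ds$,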


29 Two regions are said to be homeomorphic if they can be mapped into one another
in a continuous one-to-one manner.

Fig. 32


and when the line integral in the right-hand member of (74.2) is transformed
into a surface integral by Green's formula, we find that³⁰

(74.3)   $\displaystyle\int_C \kappa_g\, ds = -\iint_D K\, d\sigma + 2\pi - \sum_i (\pi - \alpha_i)$,

where the $\alpha_i$ are interior angles of the contour C shown in Fig. 31, and
$d\sigma = \sqrt{a}\, du^1 du^2$ is the element of the surface area of D. If C is smooth,
the sum $\sum_i (\pi - \alpha_i) = 0$.

Formula (74.3) embodies the statement of the Gauss-Bonnet theorem.
Instead of deducing this theorem with the aid of Green's formula, we give
a geometric interpretation of formula (74.3) that suggests an alternative
definition of Gaussian curvature.

Consider first a sphere S of radius R and a spherical triangle $P_1P_2P_3$
on S (Fig. 32) formed by the arcs $P_1P_2$, $P_2P_3$, $P_3P_1$ of three great circles.
Denote the interior angles of this triangle at the vertices $P_i$ by $\alpha_i$ and cover


30 The details of easy calculations are found in the following books: A. V. Pogorelov,
Differential Geometry, P. Noordhoff, N.V. (1959), p. 161; L. P. Eisenhart, Differential
Geometry, Princeton (1940), p. 191; D. J. Struik, Lectures on Classical Differential
Geometry, Addison-Wesley (1950), p. 154.


the sphere by some coordinate net $(u^1, u^2)$. If the base vector along the
$u^1$-coordinate line at $P_1$ is $\mathbf{a}_1$ and $A(P_1)$ is an arbitrary surface vector at
$P_1$, we denote the angle between $\mathbf{a}_1$ and $A(P_1)$ by $\varphi$. Let $\theta$ be the angle
made by $A(P_1)$ with the geodesic arc $P_1P_2$. When $A(P_1)$ is propagated
in a parallel manner along the geodesic triangle $P_1P_2P_3$, it assumes the
position $A'(P_1)$ shown in Fig. 32. Our immediate object is to determine
the angle $\varphi'$ between $\mathbf{a}_1$ and $A'(P_1)$.

During the parallel propagation of $A(P_1)$ along $P_1P_2$, the angle $\theta$ is
unchanged (see Sec. 60), and the vector A assumes the position $A(P_2)$.
If $\beta$ is the angle made by $A(P_2)$ with the geodesic arc $P_2P_3$, then

$\beta = \pi - (\alpha_2 + \theta)$.
In the course of parallel propagation of $A(P_2)$ along $P_2P_3$ the vector A
continues making angle $\beta$ with $P_2P_3$ and, on reaching the point $P_3$, it
assumes the position $A(P_3)$. Let $\gamma$ be the angle between $A(P_3)$ and the
arc $P_3P_1$; then
$\gamma = \alpha_3 - \beta = \alpha_3 - [\pi - (\alpha_2 + \theta)] = \alpha_2 + \alpha_3 + \theta - \pi$.
On continuing propagations of A along $P_3P_1$, the vector A maintains
the angle $\gamma$ with $P_3P_1$ until it reaches the point $P_1$, when it assumes the
position $A'(P_1)$. Now, the angle $\varphi'$ made by $A'(P_1)$ with $\mathbf{a}_1$ is

$\varphi' = \varphi + \alpha_1 + \alpha_2 + \alpha_3 - \pi$,
so that the angle $\varphi' - \varphi$ between $A'(P_1)$ and $A(P_1)$ is
(74.4)   $\varphi' - \varphi = \alpha_1 + \alpha_2 + \alpha_3 - \pi$.
The change $\varphi' - \varphi$, representing the difference between the sum of interior
angles of the spherical triangle $P_1P_2P_3$ and the sum of interior angles of
the rectilinear triangle, is called the spherical excess of the spherical
triangle $P_1P_2P_3$. If instead of interior angles $\alpha_i$ we introduce the exterior
angles $\theta_i = \pi - \alpha_i$, formula 74.4 reads

$\varphi' - \varphi = 2\pi - \displaystyle\sum_{i=1}^{3}\theta_i$.

When the vector A is propagated along a geodesic polygon of n sides,
entirely similar computations yield for the spherical excess of the polygon³¹

$\varphi' - \varphi = \displaystyle\sum_{i=1}^{n}\alpha_i - (n - 2)\pi$,
or
$\varphi' - \varphi = 2\pi - \displaystyle\sum_{i=1}^{n}\theta_i$,


31 Note that the sum of interior angles of a rectilinear polygon of n sides is $(n - 2)\pi$
radians.


if we use the exterior angles $\theta_i = \pi - \alpha_i$. But it is known from spherical
trigonometry that the spherical excess of a geodesic polygon is equal to
$\sigma/R^2$, where $\sigma$ is the area of the polygon and R is the radius of the sphere.
Thus

$\varphi' - \varphi = 2\pi - \displaystyle\sum_{i=1}^{n}\theta_i = \dfrac{\sigma}{R^2}$,

and since the Gaussian curvature K for the sphere is equal to $1/R^2$, we
can write

(74.5)   $K = \dfrac{2\pi - \sum_{i=1}^{n}\theta_i}{\sigma}$.

This formula can be generalized to obtain the Gauss-Bonnet formula
74.3 for the case where C in Fig. 31 is a geodesic polygon. Thus, if the
region D is subdivided by small geodesic polygons into subregions of
areas $d\sigma$, the familiar procedures of integral calculus applied to (74.5)
yield

(74.6)   $\displaystyle\iint_D K\, d\sigma = 2\pi - \sum_{i=1}^{n}\theta_i$.

This formula coincides with (74.3), since $\kappa_g = 0$ when C is a geodesic
polygon.
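
Formula 74.5 is easy to try on a concrete geodesic triangle. The sketch below is an illustration under stated assumptions: the function name and the choice of triangle are not from the text. It uses the octant of a sphere, a triangle with three right interior angles whose area is one eighth of the area of the sphere.

```python
import math

def curvature_from_excess(interior_angles, area):
    """Evaluate formula (74.5) for a geodesic polygon on a sphere.

    interior_angles : interior angles alpha_i in radians
    area            : area sigma of the polygon
    """
    exterior = [math.pi - alpha for alpha in interior_angles]   # theta_i = pi - alpha_i
    return (2.0 * math.pi - sum(exterior)) / area

R = 3.0                                   # illustrative radius
angles = [math.pi / 2.0] * 3              # octant triangle: three right angles
area = 4.0 * math.pi * R ** 2 / 8.0       # one eighth of the area of the sphere
K = curvature_from_excess(angles, area)
print(K, 1.0 / R ** 2)  # the two numbers should agree up to round-off
```

For this triangle the spherical excess is $\pi/2$, and dividing by the area $\pi R^2/2$ returns $1/R^2$, as formula 74.5 requires.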

Formula 74.6, first obtained by Gauss, was generalized by Bonnet³² to
yield the result (74.3), which, as we have already noted, follows directly
from (74.2) on application of Green's formula.

The left-hand member $\iint_D K\, d\sigma$ in (74.6) is called the integral curvature
of D. It turns out that the integral curvature is a topological invariant.
Two surfaces are said to be topologically equivalent if they can be mapped
into one another by a continuous one-to-one transformation. It can be
shown by using (74.3) that $\iint_D K\, d\sigma = 4\pi$ for all regular surfaces
topologically equivalent to a sphere and $\iint_D K\, d\sigma = 0$ for all regular
surfaces topologically equivalent to a torus.³³


75. The n-Dimensional Manifolds 


It is the purpose of this section to introduce a few concepts from the 
geometry of n-dimensional metric manifolds which are of interest in 
applications to dynamics and relativity. Many of these concepts are 
straightforward generalizations of ideas introduced in this chapter in 


32 O. Bonnet, Journal école polytechnique, 19 (1848), pp. 1-146.
33 See D. J. Struik, op. cit., pp. 153-159.


connection with the study of surfaces imbedded in the three-dimensional 
Euclidean manifolds. 

We shall suppose that the element of distance between two neighboring
points in an n-dimensional manifold is given by the quadratic form

(75.1)   $ds^2 = g_{ij}\, dx^i dx^j$, $(i, j = 1, \ldots, n)$, with $|g_{ij}| \neq 0$.

We extend the definition of Euclidean space, given in Sec. 29, by saying
that the space is Euclidean if there exists a transformation of coordinates
$x^i$ such that the transform of $ds^2$ is a quadratic form with constant
coefficients. Since every real quadratic form with constant coefficients
can be reduced by a real linear transformation to the form

(75.2)   $ds^2 = \lambda_i (dx^i)^2$, $(\lambda_i = \pm 1)$,

the form 75.2 can be used to define a Euclidean n-dimensional manifold.
If, in particular, the form 75.2 is definite, we shall say that the manifold
is purely Euclidean, but if it is indefinite, the manifold will be called
pseudo-Euclidean.
A linear manifold determined by a set of n equations

C:   $x^i = x^i(t)$, $t_1 \leq t \leq t_2$,

with suitable differentiability properties, will be said to define a curve C in
an n-dimensional manifold.
If the form 75.1 is positive definite, we shall say that the positive number

$s = \displaystyle\int_{t_1}^{t_2} \sqrt{g_{ij}\,\dfrac{dx^i}{dt}\dfrac{dx^j}{dt}}\; dt$

is the length of the curve C. There are definitions of metric manifolds
which are not based on the expression for the element of arc in the form
75.1, but they need not concern us here (see Sec. 43).

The vector $\lambda^i = dx^i/ds$ will be said to define the direction of the curve,
and it is clear that $g_{ij}\lambda^i\lambda^j = 1$, so that the vector $\lambda^i$ is a unit vector. The
length of any vector $A^i$ is given by the formula

$A = \sqrt{g_{ij}A^iA^j}$.

The notion of the angular metric in an n-dimensional manifold is a
direct generalization of the definition of the angle in the three-dimensional
case.
If $\lambda^i$ and $\mu^i$ are two unit vectors, we define the cosine of the angle
between them by the formula

(75.3)   $\cos\theta = g_{ij}\lambda^i\mu^j$.




It is not clear from this definition that the angle $\theta$ is necessarily real. We 
shall prove, however, that this is always so if the form $g_{ij}\,dx^i\,dx^j$ is 
positive definite. The proof follows at once from the Cauchy-Schwarz 
inequality 


(75.4)    $(g_{ij} x^i y^j)^2 \le (g_{ij} x^i x^j)(g_{ij} y^i y^j),$


where the form $g_{ij} x^i x^j \ge 0$. 

We first establish the inequality 75.4. Let the form $Q(x) \equiv g_{ij} x^i x^j$ be 
positive definite. If we replace in it $x^i$ by $x^i + \lambda y^i$, where $\lambda$ is an arbitrary 
scalar, we obtain 


    $Q(x + \lambda y) = g_{ij}(x^i + \lambda y^i)(x^j + \lambda y^j)$
    $\qquad = g_{ij} x^i x^j + 2 g_{ij} x^i y^j \lambda + g_{ij} y^i y^j \lambda^2$
    $\qquad = Q(x) + 2Q(x, y)\lambda + Q(y)\lambda^2.$

This is a quadratic expression in $\lambda$ with real coefficients. By hypothesis 
$Q(x + \lambda y) \ge 0$, the sign of equality holding if, and only if, $x^i + \lambda y^i = 0$. 
Hence, the equation in $\lambda$, 


    $f(\lambda) \equiv Q(y)\lambda^2 + 2Q(x, y)\lambda + Q(x) = 0,$


possesses no distinct real roots. But a necessary and sufficient condition 
that this be true is that $[Q(x, y)]^2 - Q(y)Q(x) \le 0$, that is, 


    $(g_{ij} x^i y^j)^2 \le (g_{ij} x^i x^j)(g_{ij} y^i y^j).$


This is precisely the inequality 75.4. 
If, now, in formula 75.4 we set $x^i = \lambda^i$ and $y^i = \mu^i$, we get 


    $\dfrac{(g_{ij}\lambda^i\mu^j)^2}{(g_{ij}\lambda^i\lambda^j)(g_{ij}\mu^i\mu^j)} \le 1,$

and, since $\lambda^i$ and $\mu^i$ are unit vectors, we have $(g_{ij}\lambda^i\mu^j)^2 \le 1$, which 
states that the angle $\theta$ in formula 75.3 is real. 
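
The inequality 75.4 and the angle formula 75.3 can be checked numerically. The following is a minimal sketch, assuming a hypothetical positive definite metric `G` and two arbitrary vectors `X` and `Y`; the helper names `inner`, `unit` and `angle` are illustrative and do not come from the text.

```python
import math

def inner(g, x, y):
    """Return g_ij x^i y^j for a metric given as a nested list."""
    n = len(x)
    return sum(g[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def unit(g, x):
    """Scale x to unit length in the metric g."""
    length = math.sqrt(inner(g, x, x))
    return [c / length for c in x]

def angle(g, x, y):
    """Angle between x and y, from formula 75.3 applied to their unit vectors."""
    c = inner(g, unit(g, x), unit(g, y))
    # Clamp to [-1, 1] to absorb rounding error before calling acos.
    return math.acos(max(-1.0, min(1.0, c)))

# Hypothetical symmetric metric with positive leading minors (positive definite).
G = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 1.5]]
X = [1.0, -2.0, 0.5]
Y = [0.3, 1.0, 4.0]

lhs = inner(G, X, Y) ** 2                  # (g_ij x^i y^j)^2
rhs = inner(G, X, X) * inner(G, Y, Y)      # (g_ij x^i x^j)(g_ij y^i y^j)
theta = angle(G, X, Y)
```

Since `lhs` does not exceed `rhs`, the cosine computed from the unit vectors lies between -1 and 1, so `theta` is a real angle, as the proof asserts.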
We define the volume element in $R_n$ by the formula 


    $d\tau = \sqrt{g}\;dx^1\,dx^2 \cdots dx^n,$


and the volume by the corresponding n-tuple integral. 

A generalization of the concepts of curvature and torsion to curves 
imbedded in the n-dimensional Riemannian manifolds is direct and 
straightforward, but matters become rapidly involved when one comes 
to consider hypersurfaces. 


²⁴ See, for example, J. C. H. Gerretsen, Lectures on Tensor Calculus and Differential 
Geometry, P. Noordhoff, N.V. (1962), Chapter 6. 




A set of n equations 

(75.5)    $x^i = x^i(u^1, \ldots, u^m), \quad (i = 1, \ldots, n), \quad m < n,$


is said to define an admissibly parametrized m-dimensional variety (or a 
hypersurface) over a neighborhood of the variables $u^\alpha$ if (a) the $x^i$ in (75.5) 
are of class $C^2$ and (b) the Jacobian matrix $(\partial x^i/\partial u^\alpha)$, $(\alpha = 1, 2, \ldots, m)$, is 
of rank m at each point of the neighborhood. 

In Secs. 64 to 73 we have studied two-dimensional Riemannian varieties 
(surfaces) imbedded in $E_3$. A question naturally arises: Under what 
circumstances can an m-dimensional variety $R_m$, with a Riemannian metric 


(75.6)    $ds^2 = a_{\alpha\beta}\,du^\alpha\,du^\beta, \quad (\alpha, \beta = 1, \ldots, m),$

be imbedded in the n-dimensional Euclidean manifold with 

(75.7)    $ds^2 = dx^i\,dx^i$ ?


Now equations 75.5 together with 75.6 and 75.7 require that 

(75.8)    $a_{\alpha\beta} = \dfrac{\partial x^i}{\partial u^\alpha}\,\dfrac{\partial x^i}{\partial u^\beta}, \quad (\alpha, \beta = 1, \ldots, m).$

The set of $\tfrac{1}{2}m(m + 1)$ partial differential equations 75.8 in n variables $x^i$ 
will not be expected to possess a solution unless $n \ge \tfrac{1}{2}m(m + 1)$; thus, 
if $m = 2$, $n \ge 3$; if $m = 3$, $n \ge 6$, and so on. This estimate, however, 
does not constitute a proof that an m-dimensional variety can be imbedded 
in $E_n$ whenever $n \ge \tfrac{1}{2}m(m + 1)$. It is possible, however, to prove that a 
neighborhood of $R_m$ can be imbedded in $E_n$ if $n \ge \tfrac{1}{2}m(m + 1)$. Con- 
cerning the global imbedding of the whole of $R_m$ in $E_n$, almost no general 
results are known. There are a few special theorems on the imbedding 
of two-dimensional varieties with special topological properties. These 
refer almost wholly to convex two-dimensional manifolds.²⁵ The problems 
on imbedding now lie at the forefront of researches in geometry. 
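
The counting estimate can be illustrated with a few lines of code. This is only a sketch of the arithmetic: the function names are hypothetical, and the comparison of unknowns with equations is the heuristic count described above, not an imbedding theorem.

```python
def equations_count(m):
    """Number of equations in 75.8: one for each pair alpha <= beta."""
    return m * (m + 1) // 2

def counting_estimate_ok(m, n):
    """True when the n unknown functions x^i are at least as many as the equations."""
    return n >= equations_count(m)

# Smallest n allowed by the count for m = 2, 3, 4.
smallest = {m: equations_count(m) for m in (2, 3, 4)}
```

For m = 2 and m = 3 the dictionary reproduces the values 3 and 6 quoted in the text.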


²⁵ See L. Nirenberg, “The Weyl and Minkowski Problem,” Comm. on Pure and Appl. 
Math., 6, 1948, and A. V. Pogorelov, Some Questions of Geometry in the Large in a 
Riemannian space, Moscow (1957). 


4 


ANALYTICAL MECHANICS 


76. Basic Concepts. Kinematics 


Analytical mechanics is concerned with a mathematical description of 
motion of material bodies subjected to the action of forces. Its develop- 
ment follows a familiar pattern. A material body is assumed to consist of 
a large number of minute bits of matter connected in some way with one 
another. The attention is first focused on a single particle, which is 
assumed to be free of constraints, and its behavior is analyzed when it is 
subjected to the action of external forces. The resulting body of knowledge 
constitutes the mechanics of a particle. To pass from mechanics of a single 
particle to mechanics of aggregates of particles composing a material body, 
we introduce the principle of superposition of effects and make specific 
assumptions concerning the nature of constraining forces, depending on 
whether the body under consideration is rigid, elastic, plastic, fluid, and so 
on. 

We begin our study of mechanics of continua by analyzing the motion of 
a single particle. The particle is assumed to be an idealized entity having 
position and inertia, but no spatial extension. The measure of inertia is 
mass, and thus the particle is simply a point-mass. Another basic ingredient 
of mechanics is the concept of time, which arises in the assumption of 
causal connection between physical events. The hypothesis of causality 
implies the possibility of ordering events, and the time t, as it appears in the 
description of the physical universe, is an independent parameter whose 
range of variation is the real-number continuum. 

We will suppose that physical events take place in the three-dimensional 
space whose metric is Euclidean, and we refer the position of a particle 
at a given time t to some curvilinear reference frame X. As in the study 
of geometry in Chapter 3, we denote the coordinates of the particle relative 
to a set of orthogonal cartesian axes by the symbols $y^i$. Clearly, the 
position of a particle is a relative concept depending on the selection of a 
reference frame. The reference system generally used in astronomy is that 
determined by the so-called fixed stars. It is termed the primary inertial 
system. Any system of axes moving relative to the primary inertial system 
with constant translational velocity is called a secondary inertial system. In 
many mechanical problems the motion of the earth relative to the primary 
inertial system is so nearly negligible that the Newtonian laws (Sec. 77), 
which are assumed to be valid only in the inertial reference frames, can be 
applied without modification to study the motion 
of particles referred to a system of axes fixed in 
the earth. 

When a particle changes its position in a given reference frame, it is said 
to undergo a displacement. Thus, suppose that the particle is at the point 
$P_1$ at time t. Its position at this time is given by the vector $\mathbf{r}_1$; at a later 
instant of time $t + \Delta t$ it is at $P_2$, determined by the position vector $\mathbf{r}_2$. 
We denote the displacement in the interval of time $\Delta t$ by the vector 
$\overrightarrow{P_1 P_2} = \Delta\mathbf{r}$ (Fig. 33) and suppose that the particle traverses a 
continuous path which is represented by the vector sum of the elementary 
displacements $d\mathbf{r}$. 

Fig. 33 

We define the average velocity of the particle during the displacement $\Delta\mathbf{r}$ 
by the formula $\mathbf{v}_{av} = \Delta\mathbf{r}/\Delta t$, and we assume that this ratio has a unique 
limit as $\Delta t \to 0$. Then the instantaneous velocity v is given by the formula 
$d\mathbf{r}/dt \equiv \dot{\mathbf{r}} = \mathbf{v}$. Velocity v is, of course, a vector. 

The case in which dr/dt is constant is of relatively minor interest in 
mechanics, and generally we will be concerned with accelerated motions. 
We define the average acceleration of the particle, during the time interval 
$\Delta t$, by the formula $\mathbf{a}_{av} = \Delta\mathbf{v}/\Delta t$, and the instantaneous acceleration by 

    $\mathbf{a} = \dfrac{d\mathbf{v}}{dt} = \dfrac{d^2\mathbf{r}}{dt^2}.$

Hereafter, unless otherwise specified, the words velocity and acceleration 
are taken to mean the instantaneous values. 

The velocity and acceleration are known as the kinematical concepts of 
mechanics to distinguish them from those concepts that utilize the idea 
of force. We consider this idea in the following section. 


77. Newtonian Laws. Dynamics 


In 1687, Sir Isaac Newton published three axioms or laws, the first of 
which was based on deductions from a set of remarkable experiments 
performed by Galileo (1564-1642) on bodies moving on inclined planes, 
and the other two represent a profound crystallization of the notions 
surrounding these experiments. These laws form the point of departure 
in all considerations in dynamics, and we give them here in a form that is 
almost a literal translation of Newton’s Latin as it appears in the 1726 
edition of the Philosophiae Naturalis Principia Mathematica. The 
present-day formulation of analytical mechanics is essentially due to J. L. Lagrange 
(1736-1813), whose greatest work, Mécanique analytique, was written in 
1788, and W. R. Hamilton (1805-1865), whose celebrated principle em- 
braces the whole of mechanics. 

NEWTONIAN Laws. I. Every body continues in its state of rest, or of 
uniform motion in a straight line, except insofar as it is compelled by im- 
pressed forces to change that state. 

II. The change of motion is proportional to the impressed motive force, 
and takes place in the direction of the straight line in which that force is 
impressed. 

III. To every action there is always an equal and contrary reaction; or 
the mutual actions of two bodies are always equal and oppositely directed 
along the same straight line. 

The first law depends for its meaning on the dynamical concept of force 
and on the kinematical idea of uniform rectilinear motion. It ascribes 
anthropomorphic attributes to a particle, which is bent on continuing its 
motion in a straight line but is somehow deflected from its intentions by a 
push or pull. Newton doubtless felt that the idea of force is intuitively 
known and requires no further explanation. We shall presently see that 
the first law is in reality a corollary of the second. 

The second law of motion also introduces the kinematical concept of 
motion and the dynamical idea of force. To understand its meaning it 
should be noted that Newton uses the term motion in the sense of momen- 
tum, that is, the product of mass by velocity. Thus the “change of motion” 
means the time rate of change of momentum, and hence in vector notation 
the second law can be stated as a formula 


(77.1)    $\dfrac{d(m\mathbf{v})}{dt} = \mathbf{F},$

provided that our units are so chosen as to make the proportionality 
constant equal to one. 
If we postulate the invariance of mass, then equation 77.1 can be 
written in the familiar form 
(77.2) F = ma. 


We note from (77.1) that, if F = 0, then d(mv)/dt = 0, so that mv = 
constant, and hence v is a constant vector. Thus the first law is a con- 
sequence of the second. 




The concept of mass can obviously be defined with the aid of the second 
law in terms of force and acceleration. There were numerous attempts to 
define mass and force independently of one another. The most familiar 
of such definitions is due to Ernst Mach,¹ who formulated a definition of 
mass with the aid of Newton’s third law of motion. In our opinion a fine- 
grained analysis of Mach’s definition of mass reveals certain logical 
difficulties which cannot be resolved by appealing to the third law alone. 
For this reason it seems best to leave one of the fundamental building 
blocks of mechanics (mass or force) undefined and admit it in the science 
of mechanics on the same basis as the “God-given integers” in mathe- 
matics. 

The third law of motion states that accelerations always occur in pairs. 
In terms of force we may say that, if a force acts on a given body, the body 
itself exerts an equal and oppositely directed force on some other body. 
Newton called the two aspects of the force action and reaction, whence the 
usual statements of the law. 

The entity of mass entering in the formulation of Newtonian laws is 
sometimes called the inertial mass (or simply inertia) to distinguish it from 
the gravitational mass M entering in the Newtonian law of gravitation. 
This law states that the force of attraction between a pair of particles is 
proportional to the product of their masses, is inversely proportional to 
the square of the distance r between them, and is directed along the line 
joining the particles. In symbols, 
(77.3)    $\mathbf{F} = k\,\dfrac{M_1 M_2}{r^3}\,\mathbf{r},$

where k is a universal constant and $\mathbf{r}$ is a vector directed from mass $M_1$ to 
mass $M_2$. 

If it is assumed (as it is usually done) that the gravitational and inertial 
masses are equal, the law 77.3 furnishes a practical means for comparing 
masses with the aid of beam balances. 

In order to develop the science of mechanics of a universe consisting of 
more than two particles, it is necessary to adjoin to Newtonian laws the 
principle of superposition of effects and make further assumptions re- 
garding the nature of constraints. 


78. Equations of Motion of a Particle. Work. Energy 


Let the position of a moving particle P be determined by a vector r. 
If the curvilinear coordinates of the terminal point of r are denoted by 
$x^i(t)$, then the equations of the path C of the particle can be written in the 
form 


(78.1)    C:  $x^i = x^i(t),$


and we call the curve C the trajectory of the particle. 
The velocity of P is a vector $\mathbf{v} = d\mathbf{r}/dt$, whose components are 


(78.2)    $v^i = \dfrac{dx^i}{dt}.$


¹ E. Mach, The Science of Mechanics. An interesting survey of it is contained in 
R. B. Lindsay and H. Margenau, Foundations of Physics. 


The acceleration $\mathbf{a} = d\mathbf{v}/dt = d^2\mathbf{r}/dt^2$ has the components (see Secs. 46 
and 47) 


(78.3)    $a^i = \dfrac{\delta v^i}{\delta t} = \dfrac{d^2 x^i}{dt^2} + \begin{Bmatrix} i \\ jk \end{Bmatrix} \dfrac{dx^j}{dt}\,\dfrac{dx^k}{dt},$


where $\delta v^i/\delta t$ is the intrinsic derivative and the $\begin{Bmatrix} i \\ jk \end{Bmatrix}$ are the Christoffel 
symbols calculated from the metric tensor $g_{ij}$ associated with the reference 
system X. 
If the mass of P is m, Newton’s second law of motion yields the equation 
$\mathbf{F} = m\,d^2\mathbf{r}/dt^2$, or 


(78.4)    $F^i = m\,\dfrac{\delta v^i}{\delta t} = m a^i.$


In orthogonal cartesian coordinates, equation 78.4 assumes the familiar 
form $F^i = m\,d^2 y^i/dt^2$. 

We introduce next the concept of energy, which will permit us to give 
a more elegant formulation of the theory. The germ of the energy concept 
can be traced back at least to Galileo, who remarked, “What is gained in 
power is lost in speed,” but the first clear introduction of the idea of energy 
in mechanics as a quantity equal to the product of mass and the square of 
velocity of the particle (vis viva) was made by Huygens in the seventeenth 
century. The full use of this idea, however, and of its relation to the con- 
cept of work, did not come until the nineteenth century. 

We define the element of work done by the force F in producing a 
displacement dr by the invariant $dW = \mathbf{F} \cdot d\mathbf{r}$, and, since the components of 
F and dr are, respectively, $F^i$ and $dx^i$, this scalar product is equal to 

(78.5)    $dW = g_{ij} F^i\,dx^j = F_j\,dx^j,$


where the $F_j = g_{ij} F^i$ are the covariant components of the vector F. We 
shall suppose that, in general, the functions $F_i(x)$, defining the vector field 
F, belong to class $C^1$. The work done in displacing a particle along the 
trajectory C, joining a pair of points $P_1$ and $P_2$, is the line integral 


(78.6)    $W = \int_{P_1}^{P_2} F_i\,dx^i.$

Making use of Newton’s second law of motion 78.4, we can write 78.6 
in the form 


(78.7)    $W = \int_{P_1}^{P_2} m\,g_{ij}\,\dfrac{\delta v^i}{\delta t}\,dx^j = \int_{P_1}^{P_2} m\,g_{ij}\,\dfrac{\delta v^i}{\delta t}\,v^j\,dt.$

But 

    $\dfrac{\delta (g_{ij} v^i v^j)}{\delta t} = 2 g_{ij}\,\dfrac{\delta v^i}{\delta t}\,v^j,$

and, since $g_{ij} v^i v^j$ is an invariant, 

    $\dfrac{\delta (g_{ij} v^i v^j)}{\delta t} = \dfrac{d (g_{ij} v^i v^j)}{dt},$

and hence 

    $\dfrac{d}{dt}(g_{ij} v^i v^j) = 2 g_{ij}\,\dfrac{\delta v^i}{\delta t}\,v^j.$

Inserting from this result in the integrand of (78.7) yields 

(78.8)    $W = \dfrac{m}{2} \int_{P_1}^{P_2} \dfrac{d}{dt}(g_{ij} v^i v^j)\,dt = \dfrac{m}{2}\,g_{ij} v^i v^j \Big|_{P_1}^{P_2} = T_2 - T_1,$

where 

    $T \equiv \dfrac{m}{2}\,g_{ij} v^i v^j = \dfrac{m v^2}{2}.$


We have the result that the work done by the force $F_i$ in displacing the 
particle from the point $P_1$ to the point $P_2$ is equal to the difference of the 
values of the quantity $T = \tfrac{1}{2}mv^2$ at the end and at the beginning of 
the displacement. We define the quantity $T = \tfrac{1}{2}mv^2$, which is exactly 
one-half of the vis viva of Huygens, as the kinetic energy of the particle. 

The statement embodied in the formula 78.8 can be enunciated as a 

THEOREM. The work done in displacing a particle along its trajectory 
is equal to the change in the kinetic energy of the particle. 
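
The theorem can be checked numerically on a simple case. The sketch below assumes orthogonal cartesian coordinates (so that $g_{ij} = \delta_{ij}$), a hypothetical mass `M`, and an illustrative trajectory $y^1 = t^2$, $y^2 = \sin t$ that is not taken from the text; the work is evaluated as the integral of $F_i v^i\,dt$ by Simpson's rule and compared with the change in kinetic energy.

```python
import math

M = 2.0  # hypothetical mass

def velocity(t):
    """Velocity components for the assumed path y = (t^2, sin t)."""
    return (2.0 * t, math.cos(t))

def acceleration(t):
    """Acceleration components for the same path."""
    return (2.0, -math.sin(t))

def power(t):
    """Integrand F_i v^i, with F^i = m a^i in cartesian coordinates."""
    v, a = velocity(t), acceleration(t)
    return M * (a[0] * v[0] + a[1] * v[1])

def simpson(f, t0, t1, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (t1 - t0) / n
    total = f(t0) + f(t1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(t0 + k * h)
    return total * h / 3.0

def kinetic_energy(t):
    v = velocity(t)
    return 0.5 * M * (v[0] ** 2 + v[1] ** 2)

work = simpson(power, 0.0, 1.0)                        # W along the trajectory
delta_T = kinetic_energy(1.0) - kinetic_energy(0.0)    # T_2 - T_1
```

The two numbers agree to within the quadrature error, which is the content of formula 78.8 for this particular motion.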




It may happen that the force field $F_i$ is such that the integral 78.6 is 
independent of the path. In this event the integrand $F_i\,dx^i$ is an exact 
differential, 


(78.9)    $dW = F_i\,dx^i,$


of the work function W. The negative of the work function W is called the 
force potential or potential energy. We denote the potential energy by the 
symbol V, and conclude from (78.9) that 


(78.10)    $F_i = -\dfrac{\partial V}{\partial x^i}.$


The fields of force for which potential functions exist are called conservative. 
There is a simple criterion for a field of force $F_i$ to be conservative. We 
state it as a 

THEOREM. A necessary and sufficient condition that a force field $F_i$, 
defined in a simply connected region, be conservative is that $F_{i,j} = F_{j,i}$. 

The proof of this theorem follows immediately from the observation 
that a necessary and sufficient condition for the expression $F_i\,dx^i$ to be an 
exact differential of a single-valued function V is that 


(78.11)    $\dfrac{\partial F_i}{\partial x^j} = \dfrac{\partial F_j}{\partial x^i},$


and, since $\begin{Bmatrix} k \\ ij \end{Bmatrix}$ is symmetric in i and j, the Christoffel terms cancel in 
the difference $F_{i,j} - F_{j,i}$, and we conclude that the condition 
78.11 is completely equivalent to the one stated in the theorem. 

As a corollary we observe that a parallel force field (Sec. 48) is necessarily 
conservative, since the condition for a vector field $F_i$ to be parallel is 
$F_{i,j} = 0$. 
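
The criterion 78.11 lends itself to a quick numerical test. The following sketch assumes cartesian coordinates and approximates the partial derivatives by central differences; the two sample fields (one derived from the illustrative potential $V = x^2 y + z$, one purely rotational) and all function names are assumptions made for the example.

```python
def partial(f, x, j, h=1e-5):
    """Central-difference approximation of df/dx^j at the point x."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def is_conservative_at(F, x, tol=1e-6):
    """Check dF_i/dx^j = dF_j/dx^i (condition 78.11) at a single point x."""
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dFi_dxj = partial(lambda p: F(p)[i], x, j)
            dFj_dxi = partial(lambda p: F(p)[j], x, i)
            if abs(dFi_dxj - dFj_dxi) > tol:
                return False
    return True

def gradient_field(p):
    """F_i = -dV/dx^i for the assumed potential V = x^2 y + z."""
    x, y, _ = p
    return (-2.0 * x * y, -x * x, -1.0)

def rotational_field(p):
    """A field with nonzero curl; it has no potential."""
    x, y, _ = p
    return (-y, x, 0.0)

point = (0.3, -1.2, 0.7)
```

A check at one point can only refute conservativeness; to confirm it, the condition must hold throughout a simply connected region, as the theorem requires.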


79. Lagrangean Equations of Motion 


An alternative formulation of the Newtonian law 78.4, phrased in terms 
of the kinetic energy of the particle, was obtained by Lagrange from the 
principle discussed in Sec. 84. We derive these equations in this section by 
a direct calculation which makes use of Newton’s second law of motion. 


² If the region is multiply connected, the conditions 78.11 still guarantee the existence 
of potential V related to F; by formula 78.10, but, in this case, the function V, in 
general, is multiple valued. 




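The kinetic energy $T = \tfrac{1}{2}mv^2$ can be written as 

(79.1)    $T = \dfrac{m}{2}\,g_{ij}\,\dot{x}^i \dot{x}^j,$


since $\dot{x}^i = v^i$. If we differentiate (79.1) with respect to $\dot{x}^i$ we obtain 
$\partial T/\partial \dot{x}^i = m g_{ij}\dot{x}^j$. The derivative of this expression with respect to t is 


    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) = m\left(g_{ij}\ddot{x}^j + \dfrac{\partial g_{ij}}{\partial x^k}\,\dot{x}^j \dot{x}^k\right).$


If we subtract from this the derivative of (79.1) with respect to $x^i$, namely, 

    $\dfrac{\partial T}{\partial x^i} = \dfrac{m}{2}\,\dfrac{\partial g_{jk}}{\partial x^i}\,\dot{x}^j \dot{x}^k,$

we get 


    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) - \dfrac{\partial T}{\partial x^i} = m\left[g_{ij}\ddot{x}^j + \dfrac{1}{2}\left(\dfrac{\partial g_{ij}}{\partial x^k} + \dfrac{\partial g_{ik}}{\partial x^j} - \dfrac{\partial g_{jk}}{\partial x^i}\right)\dot{x}^j \dot{x}^k\right]$

    $\qquad = m\left(g_{ij}\ddot{x}^j + [jk, i]\,\dot{x}^j \dot{x}^k\right)$

    $\qquad = m\,g_{il}\left(\ddot{x}^l + \begin{Bmatrix} l \\ jk \end{Bmatrix} \dot{x}^j \dot{x}^k\right).$


But by (78.3) the expression in parentheses on the right is the acceleration 
$a^l$, and, since $m g_{il} a^l = m a_i = F_i$, we can write 


(79.2)    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) - \dfrac{\partial T}{\partial x^i} = F_i.$

Equations 79.2 give the statement of Newton’s second law in the form used 
by Lagrange. 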
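
As a numerical illustration of equations 79.2 in curvilinear coordinates, the sketch below takes plane polar coordinates, for which $T = \tfrac{m}{2}(\dot{r}^2 + r^2\dot{\theta}^2)$, and a force-free particle moving on the straight line $y^1 = 1 + 2t$, $y^2 = 3 - t$. The mass, the line and the finite-difference step are assumptions chosen for the example; with $F_i = 0$ both left-hand members of 79.2 should vanish.

```python
import math

M = 1.5  # hypothetical mass

def polar_state(t):
    """Return (r, r_dot, theta_dot) for the motion y1 = 1 + 2t, y2 = 3 - t."""
    y1, y2 = 1.0 + 2.0 * t, 3.0 - t
    y1_dot, y2_dot = 2.0, -1.0
    r = math.hypot(y1, y2)
    r_dot = (y1 * y1_dot + y2 * y2_dot) / r
    theta_dot = (y1 * y2_dot - y2 * y1_dot) / (r * r)
    return r, r_dot, theta_dot

def p_r(t):
    """dT/d(r_dot) = m r_dot."""
    _, r_dot, _ = polar_state(t)
    return M * r_dot

def p_theta(t):
    """dT/d(theta_dot) = m r^2 theta_dot."""
    r, _, theta_dot = polar_state(t)
    return M * r * r * theta_dot

def ddt(f, t, h=1e-4):
    """Central-difference time derivative."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def lagrange_left_sides(t):
    """Left-hand members of 79.2 for the coordinates r and theta."""
    r, _, theta_dot = polar_state(t)
    e_r = ddt(p_r, t) - M * r * theta_dot ** 2   # dT/dr = m r theta_dot^2
    e_theta = ddt(p_theta, t)                    # dT/dtheta = 0
    return e_r, e_theta

e_r, e_theta = lagrange_left_sides(0.5)
```

Both values are zero to within the finite-difference error, in agreement with $F_r = F_\theta = 0$ for uniform rectilinear motion.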


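For a conservative system, $F_i = -\partial V/\partial x^i$, and equations 79.2 become 

(79.3)    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) - \dfrac{\partial T}{\partial x^i} = -\dfrac{\partial V}{\partial x^i},$

or 

(79.4)    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) - \dfrac{\partial (T - V)}{\partial x^i} = 0.$


We recall that the potential energy V is a function of the coordinates 
$x^i$ alone; hence, if we introduce the Lagrangean function 


    $L \equiv T - V,$


we can write equation 79.4 in the form 


    $\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial \dot{x}^i}\right) - \dfrac{\partial L}{\partial x^i} = 0.$
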


In the application of Lagrangean equations to specific problems one 
frequently deals with the physical components $F^{(i)}$ of the force vector F 
instead of the tensor components $F^i$. The physical components of F, we 
recall, are the coefficients in the representation 


    $\mathbf{F} = F^{(i)}\,\mathbf{e}_i,$


where the $\mathbf{e}_i$'s are unit vectors codirectional with the base vectors $\mathbf{a}_i$. 
(See Sec. 45.) Since $\mathbf{F} = F^i \mathbf{a}_i$ and $\mathbf{a}_i \cdot \mathbf{a}_j = g_{ij}$, the physical components 
$F^{(i)}$ are related to the tensor components $F^i$ by the formula 


    $F^{(i)} = \sqrt{g_{ii}}\,F^i$, (no sum). 
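
The conversion from tensor to physical components is a one-line computation once the diagonal of the metric is known. The sketch below assumes an orthogonal coordinate system and uses spherical coordinates as the example; the function names and the sample component values are illustrative only.

```python
import math

def spherical_metric_diagonal(r, theta):
    """g_11, g_22, g_33 for ds^2 = dr^2 + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2."""
    return (1.0, r * r, (r * math.sin(theta)) ** 2)

def physical_components(tensor_components, metric_diagonal):
    """Apply F^(i) = sqrt(g_ii) F^i (no sum) component by component."""
    return tuple(math.sqrt(g) * f
                 for f, g in zip(tensor_components, metric_diagonal))

# Hypothetical contravariant components at r = 2, theta = pi/2.
phys = physical_components((1.0, 0.5, 2.0),
                           spherical_metric_diagonal(2.0, math.pi / 2))
```

The physical components all carry the dimensions of force, whereas the tensor components $F^2$ and $F^3$ in spherical coordinates do not.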
Problems 


1. Show that the covariant components of the acceleration vector in a spherical 
coordinate system with $ds^2 = (dx^1)^2 + (x^1\,dx^2)^2 + (x^1)^2 \sin^2 x^2\,(dx^3)^2$ are 


    $a_1 = \ddot{x}^1 - x^1(\dot{x}^2)^2 - x^1(\dot{x}^3 \sin x^2)^2,$

    $a_2 = \dfrac{d}{dt}\left[(x^1)^2 \dot{x}^2\right] - (x^1)^2 \sin x^2 \cos x^2\,(\dot{x}^3)^2,$

    $a_3 = \dfrac{d}{dt}\left[(x^1 \sin x^2)^2 \dot{x}^3\right].$

Deduce these expressions from formula 78.3 and also from Lagrangean equations 
79.2. Hint: $F_i = ma_i$ and $T = \dfrac{mv^2}{2} = \dfrac{m}{2}\,g_{ij}\dot{x}^i\dot{x}^j$. 

2. Use Lagrangean equations to show that, if a particle is not subjected to the 
action of forces, then its trajectory is given by $y^i = a^i t + b^i$, where the $a^i$ and 
$b^i$ are constants and the $y^i$ are orthogonal cartesian coordinates. 

3. Find, with the aid of Lagrangean equations, the trajectory of a particle 
moving in a uniform gravitational field. Hint: $T = \tfrac{1}{2}m\dot{y}^i\dot{y}^i$ and $V = mgy^3$, 
where $y^3$ is normal to the plane of the earth. 

4. Deduce from Newtonian equations the equation of energy, T + V = h, 
where h is a constant. Hint: Show that $dT/dt = ma_i v^i = -dV/dt$. 

5. Prove that, if a particle moves so that its velocity is constant in magnitude, 
then its acceleration vector is either orthogonal to the velocity vector, or it is 
zero. Hint: Compute the intrinsic derivative of $v^2 = g_{ij}v^i v^j$. 

6. We have shown in Sec. 79 that $\dfrac{d}{dt}\dfrac{\partial T}{\partial \dot{x}^i} - \dfrac{\partial T}{\partial x^i}$ is a covariant vector $F_i$ 
whenever $T(x, \dot{x})$ is an invariant defined by (79.1). Prove more generally that if 
$W(x, \dot{x})$ is an invariant, then both $\partial W/\partial \dot{x}^i$ and $\dfrac{d}{dt}\dfrac{\partial W}{\partial \dot{x}^i} - \dfrac{\partial W}{\partial x^i}$ are covariant 
vectors. Hint: Let $x^i = x^i(q^1, q^2, q^3)$ be an admissible transformation of 
coordinates. Compute $\dot{x}^i$, show that $\partial \dot{x}^i/\partial \dot{q}^j = \partial x^i/\partial q^j$, and observe that 
the invariance of $W(x, \dot{x})$ requires that $W(x, \dot{x}) = W[x(q), \dot{x}(q, \dot{q})] = \overline{W}(q, \dot{q})$. 




80. Applications of Lagrangean Equations 


As an illustration of the application of Lagrangean equations to the 
determination of trajectories, we consider several examples, which include 
the important cases of particles moving on smooth curves and surfaces. 

1. Free-Moving Particle. If a particle is not subjected to the action of 
forces, the right-hand member of equation 79.2 vanishes, and we have 


(80.1)    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{x}^i}\right) - \dfrac{\partial T}{\partial x^i} = 0.$


If the coordinates $x^i$ are chosen to be rectangular cartesian, then 
$T = (m/2)\dot{y}^i \dot{y}^i$, and hence equation 80.1 yields $m\ddot{y}^i = 0$. Integration of this 
equation gives $y^i = a^i t + b^i$, which represents a straight line. 

2. Constant Gravitational Field. Again we choose a cartesian reference 
frame and take the $Y^3$-axis to be normal to the plane of the earth. The 
potential V of the constant gravitational field is $V = mgy^3$, if the positive 
$Y^3$-axis is directed upward. In this case equations 79.2 give 


    $\ddot{y}^1 = 0, \quad \ddot{y}^2 = 0, \quad \ddot{y}^3 = -g,$

so that the trajectory is determined by 


    $y^\alpha = a^\alpha t + b^\alpha, \quad (\alpha = 1, 2),$
    $y^3 = -\tfrac{1}{2}gt^2 + a^3 t + b^3.$


Thus the trajectory is a parabola whose axis is parallel to the $Y^3$-axis. 
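
The solution of Example 2 can be evaluated directly, and the energy equation T + V = constant gives a convenient check on it. The sketch below assumes SI units with g = 9.81 and arbitrary integration constants; the names `position`, `velocity` and `energy` are illustrative.

```python
G_ACC = 9.81  # assumed value of g

def position(t, a, b):
    """Return (y^1, y^2, y^3) at time t for integration constants a^i, b^i."""
    return (a[0] * t + b[0],
            a[1] * t + b[1],
            -0.5 * G_ACC * t * t + a[2] * t + b[2])

def velocity(t, a):
    """Time derivative of position: only the third component changes."""
    return (a[0], a[1], a[2] - G_ACC * t)

def energy(t, a, b, m=1.0):
    """T + V with T = (m/2) v.v and V = m g y^3."""
    v = velocity(t, a)
    y = position(t, a, b)
    return 0.5 * m * sum(c * c for c in v) + m * G_ACC * y[2]

a_const = (1.0, 2.0, 3.0)   # hypothetical initial velocity components
b_const = (0.0, 0.0, 1.0)   # hypothetical initial position
e_start = energy(0.0, a_const, b_const)
e_later = energy(2.5, a_const, b_const)
```

The constants $a^i$ and $b^i$ are the initial velocity and initial position, so the six constants of integration are fixed by the initial data.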
3. Motion of a Particle on a Curve. Let a particle be constrained to 
move on a curve C whose equations are 


(80.2)    C:  $x^i = x^i(s), \quad (i = 1, 2, 3),$


s being the arc parameter. We shall suppose that C has a continuously 
turning tangent, so that the $x^i(s)$ are of class $C^2$. 

The components $v^i$ of the velocity vector v of the particle are 

(80.3)    $v^i = \dfrac{dx^i}{dt} = \dfrac{dx^i}{ds}\,\dfrac{ds}{dt} = v\lambda^i,$

where $\lambda^i = dx^i/ds$ is the unit tangent vector to C and $v = ds/dt$ is the 
magnitude of v. 

The components $a^i$ of the acceleration vector a are determined by 
computing the intrinsic derivative of (80.3) with respect to t, 

(80.4)    $a^i = \dfrac{\delta v^i}{\delta t} = \dfrac{dv}{dt}\,\lambda^i + v\,\dfrac{\delta \lambda^i}{\delta t},$

where we wrote $\delta v/\delta t = dv/dt$, since v is a scalar. However, 


(80.5)    $\dfrac{\delta \lambda^i}{\delta t} = \dfrac{\delta \lambda^i}{\delta s}\,\dfrac{ds}{dt} = v\kappa\mu^i,$


where we recalled the Frenet formula 


[50.1]    $\dfrac{\delta \lambda^i}{\delta s} = \kappa\mu^i, \quad \kappa > 0,$

defining the curvature $\kappa$ and the principal normal unit vector $\mu^i$. 
On substituting from (80.5) in (80.4), we get 


(80.6)    $a^i = \dfrac{dv}{dt}\,\lambda^i + \kappa v^2 \mu^i,$


which states that the acceleration vector a lies in the osculating plane of the 
curve. Moreover, the component of a in the tangential direction is equal 
to the time rate of change of speed v, whereas the component in the 
direction of the principal normal is $v^2/R$, where $R = 1/\kappa$ is the radius of 
curvature of C. 
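
Formula 80.6 can be verified numerically for a plane curve. The sketch below assumes a circle of hypothetical radius `RHO` traversed with the angle $\varphi = t^2$, so that $v = 2\rho t$, $dv/dt = 2\rho$ and $\kappa = 1/\rho$; the cartesian acceleration is obtained by second differences and projected on the unit tangent and the principal normal.

```python
import math

RHO = 2.0  # hypothetical radius of the circle

def phi(t):
    """Assumed angular position along the circle."""
    return t * t

def position(t):
    return (RHO * math.cos(phi(t)), RHO * math.sin(phi(t)))

def second_derivative(f, t, h=1e-3):
    """Second central difference of a vector-valued function of t."""
    p0, p1, p2 = f(t - h), f(t), f(t + h)
    return tuple((a - 2.0 * b + c) / (h * h) for a, b, c in zip(p0, p1, p2))

def decompose(t):
    """Return the tangential and normal components of the acceleration."""
    acc = second_derivative(position, t)
    tangent = (-math.sin(phi(t)), math.cos(phi(t)))    # lambda, along the motion
    normal = (-math.cos(phi(t)), -math.sin(phi(t)))    # mu, toward the centre
    a_t = acc[0] * tangent[0] + acc[1] * tangent[1]
    a_n = acc[0] * normal[0] + acc[1] * normal[1]
    return a_t, a_n

a_t, a_n = decompose(1.0)
speed = 2.0 * RHO * 1.0   # v = rho * d(phi)/dt evaluated at t = 1
```

At t = 1 the tangential component should equal $dv/dt = 2\rho$ and the normal component $v^2/\rho$, up to the finite-difference error.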

The force $\mathbf{F} = m\mathbf{a}$ acting on a particle of mass m moving along C is 
determined by 


(80.7)    $F^i = m\,\dfrac{dv}{dt}\,\lambda^i + m\kappa v^2 \mu^i.$


It should be observed that F is the resultant of all external forces that act 
on the particle and thus F includes the reaction R of the curve on the 
particle. Since F lies in the osculating plane of the curve, the component 
of all external forces normal to this plane is zero. This condition enables 
us to compute reaction R in the general case. In mechanics the curve C is 
said to be smooth if the reaction R is normal to C, that is, if $R^i\lambda_i = 0$.³ 
If R = 0 the curve C is called the natural trajectory of the particle. 

As an illustration, let a bead of mass m slide under gravity along a 
smooth curve C lying in the vertical $Y^1Y^2$-plane (Fig. 34). 

The force F acting on m is 

    $\mathbf{F} = m\mathbf{g} + \mathbf{R},$


where R is the pressure exerted by the curve on the particle and mg is the 
gravitational force. Since the curve is smooth, R is normal to C. If $\alpha$ is 
the angle between the direction of R and the positive $Y^2$-axis, the 
components of F in the directions of the tangent $\lambda$ and the principal normal 
$\mu$ are 


    $F_{(\lambda)} = -mg\sin\alpha, \quad F_{(\mu)} = -mg\cos\alpha + R.$


³ This is equivalent to saying that the frictional force is zero. The term “smooth” 
employed in mechanics is different from the term “smooth” used in geometry, where a 
“smooth curve” is one with a continuously turning tangent. 


Fig. 34 

On referring to (80.7), we conclude that 

(80.8)    $m\,\dfrac{dv}{dt} = -mg\sin\alpha, \quad m\kappa v^2 = -mg\cos\alpha + R.$


But $\cos\alpha = dy^1/ds$, $\sin\alpha = dy^2/ds$, and $dv/dt = (dv/ds)(ds/dt)$. 
Accordingly, the first of equations 80.8 yields 


    $mv\,\dfrac{dv}{ds} = -mg\,\dfrac{dy^2}{ds},$

so that 

(80.9)    $\tfrac{1}{2}mv^2 = -mgy^2 + \text{constant}.$


Since in this case the component of R in the direction of the path is zero, 
we could have written equation 80.9 directly from the energy equation 
T + V = constant. 

Equation 80.9 determines the speed v along C as a function of $y^2$. The 
second equation in (80.8) then serves to determine R as a function of the 
curvature $\kappa$. If the curve is rough, R is no longer normal to C and 
the angle $\alpha$ depends on the coefficient of friction. 

As a concrete example, let a particle of mass m move under gravity 
along a smooth cycloid 


(a)    $y^1 = a(\theta - \sin\theta), \quad y^2 = a(1 + \cos\theta), \quad 0 \le \theta \le 2\pi,$

shown in Fig. 35. Then the first of equations 80.8 yields 


(b)    $m\,\dfrac{dv}{dt} = -mg\,\dfrac{dy^2}{ds},$


where 


    $s = \int_0^\theta \sqrt{(dy^1)^2 + (dy^2)^2} = a\int_0^\theta \sqrt{2(1 - \cos\theta)}\;d\theta$

    $\quad = 2a\int_0^\theta \sin\dfrac{\theta}{2}\,d\theta = 4a\left(1 - \cos\dfrac{\theta}{2}\right).$


Since $\cos^2(\theta/2) = \tfrac{1}{2}(1 + \cos\theta)$ we deduce, on noting the second of 
equations (a), that 

    $y^2 = \dfrac{(s - 4a)^2}{8a}.$

Accordingly, since $dv/dt = \ddot{s}$, (b) yields the equation 


    $\ddot{s} + \dfrac{g}{4a}\,s = g,$

the general solution of which is 


(c)    $s = c_1\cos\left(\sqrt{g/4a}\;t + c_2\right) + 4a.$


The integration constants $c_1$ and $c_2$ are determined by the initial position 
and initial velocity of m on the cycloid. 

It is clear from (c) that the period of motion is independent of the 
amplitude $c_1$ and is equal to $2\pi/\sqrt{g/4a}$. This fact was discovered by 
Christian Huygens, about 300 years ago. Huygens proposed the use of 
cycloidal pendulum in the construction of isochronous clocks. 
Calculations, making use of the second equation 80.8, show that $R = 2mg\cos\alpha$. 
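
The isochronism of the cycloid can also be seen by integrating the equation $\ddot{s} + (g/4a)s = g$ numerically for different amplitudes. The sketch below is one possible check, assuming a = 1, g = 9.81 and a velocity Verlet scheme (a choice made here, not a method used in the text); it measures the time between two successive instants of rest, which is half a period.

```python
import math

def half_period(amplitude, a=1.0, g=9.81, dt=1e-4):
    """Time from rest at s = 4a - amplitude to the next instant of rest."""
    s = 4.0 * a - amplitude
    v = 0.0
    t = 0.0
    acc = g - (g / (4.0 * a)) * s
    while True:
        # One velocity Verlet step for s'' = g - (g / 4a) s.
        s_new = s + v * dt + 0.5 * acc * dt * dt
        acc_new = g - (g / (4.0 * a)) * s_new
        v_new = v + 0.5 * (acc + acc_new) * dt
        if v > 0.0 and v_new <= 0.0:
            # Linear interpolation of the instant at which the speed vanishes.
            return t + dt * v / (v - v_new)
        s, v, acc, t = s_new, v_new, acc_new, t + dt

period_small = 2.0 * half_period(0.5)    # small oscillation
period_large = 2.0 * half_period(3.0)    # large oscillation
period_exact = 2.0 * math.pi / math.sqrt(9.81 / 4.0)   # 2 pi / sqrt(g / 4a)
```

The amplitudes must not exceed 4a, since the arc length s is confined to the interval from 0 to 8a on one arch of the cycloid.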


Fig. 35 




Problems 


1. Deduce the differential equations for a simple pendulum of length l, and 
show that for small oscillations the period is $2\pi/\sqrt{g/l}$. 


2. Derive the equations of motion for a particle moving under gravity on a 
smooth helix: 


    $y^1 = a\cos\theta, \quad y^2 = a\sin\theta, \quad y^3 = k\theta.$


Note that since the helix is smooth, the reaction R is normal to the helix and 
hence the component of the resultant force F in the tangential direction is equal 
to the component of the gravitational force mg in that direction. The latter 
component can be computed from gravitational potential $V = mgy^3$. 
Furthermore, by (80.9) the energy equation in this case yields 
$\tfrac{1}{2}m(v^2 - v_0^2) = mg(y_0^3 - y^3)$. 

3. If a particle of mass m moves on a smooth parabola with its axis vertical 
and concavity downwards, show that the reaction $R = \kappa m(v^2 - v'^2)$, where $v'$ 
is the velocity of m for which this parabola is the natural trajectory and v is the 
velocity in constrained motion. 


4. Motion of a Particle on a Surface. Let the equations of a regular 
surface S be given in a parametric form as 


(80.10)    S:  $x^i = x^i(u^1, u^2), \quad (i = 1, 2, 3),$


and let a particle of mass m be constrained to move on S under the action 
of the force F. The force F is the resultant of all external forces acting on 
the particle and thus includes the reaction R of the surface on the particle. 
When the surface is smooth, R is normal to S and represents pressure that 
constrains the particle to remain on S. 

The space components $v^i$ of the velocity vector v of the particle are 
related to the surface components $v^\alpha$ by the formula⁴ 


    $\dot{x}^i = \dfrac{\partial x^i}{\partial u^\alpha}\,\dot{u}^\alpha, \quad (\alpha = 1, 2),$

or 


(80.11)    $v^i = x^i_\alpha v^\alpha,$


where $v^\alpha = \dot{u}^\alpha$. 

⁴ See equation 64.5. The reader should take care not to confuse the base vectors $\mathbf{a}_\alpha$ 
used in Chapter 3 with the acceleration components $a^\alpha$ used in this section. 

The acceleration $a^i = \delta v^i/\delta t$; hence equation 80.11 yields 

    $a^i = x^i_\alpha\,\dfrac{\delta v^\alpha}{\delta t} + x^i_{\alpha,\beta}\,v^\alpha \dot{u}^\beta,$

or 

(80.12)    $a^i = x^i_\alpha a^\alpha + x^i_{\alpha,\beta} v^\alpha v^\beta,$


where $a^\alpha = \delta v^\alpha/\delta t$. 
If we make use of the Gauss formula 


[67.7]    $x^i_{\alpha,\beta} = b_{\alpha\beta} n^i,$

equation 80.12 reads 

(80.13)    $a^i = x^i_\alpha a^\alpha + b_{\alpha\beta} v^\alpha v^\beta n^i.$

Thus 

    $a^i = x^i_\alpha a^\alpha + b_{\alpha\beta} v^2 \lambda^\alpha \lambda^\beta n^i,$

and, since the normal curvature $\kappa_{(n)} = b_{\alpha\beta}\lambda^\alpha\lambda^\beta$, we have 

    $a^i = x^i_\alpha a^\alpha + v^2 \kappa_{(n)} n^i.$

Since $F^i = ma^i$, we have 

(80.14)    $F^i = m x^i_\alpha a^\alpha + m v^2 \kappa_{(n)} n^i.$


The first term in the right-hand member of (80.14) is the component of 
F in the tangent plane to S, whereas the second is the component of F 
along the normal n. For, the component of F in the direction of the 
normal n is 


(80.15)    $F^i n_i = m x^i_\alpha n_i a^\alpha + m v^2 \kappa_{(n)} n^i n_i = 0 + m v^2 \kappa_{(n)},$


since the surface vectors $x^i_\alpha$ are orthogonal to $n_i$ and $n^i n_i = 1$. The 
components of F in the plane tangent to S, on the other hand, are given by 


    $g_{ij} x^j_\alpha F^i = m g_{ij} x^i_\beta x^j_\alpha a^\beta + m v^2 \kappa_{(n)} g_{ij} x^j_\alpha n^i = m a_{\alpha\beta} a^\beta + 0,$


since $g_{ij} x^i_\beta x^j_\alpha = a_{\alpha\beta}$ by (64.6), and $g_{ij} x^j_\alpha n^i = 0$, because the surface 
vectors $x^j_\alpha$ are orthogonal to $n_j$. If we rewrite this relation as 

    $x^j_\alpha F_j = m a_\alpha,$

and set $F_\alpha = x^j_\alpha F_j$, we obtain a pair of Newtonian equations 

(80.16)    $F_\alpha = m a_\alpha,$


relating the surface force vector $F_\alpha$ to the surface acceleration vector $a_\alpha$. 


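Equations 80.16 can be recast into equivalent Lagrangean form by 
noting that the kinetic energy $T = \tfrac{1}{2}mv^2$ is 


    $T = \dfrac{m}{2}\,a_{\alpha\beta} v^\alpha v^\beta = \dfrac{m}{2}\,a_{\alpha\beta} \dot{u}^\alpha \dot{u}^\beta.$


We obtain, as in Sec. 79, 

(80.17)    $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial \dot{u}^\alpha}\right) - \dfrac{\partial T}{\partial u^\alpha} = F_\alpha,$

where $F_\alpha$ is defined by (80.16). When the force field is conservative, 
$F_\alpha = -\partial V/\partial u^\alpha$, where V is the potential. 

We can deduce the equation analogous to (80.6) for the acceleration 
along the trajectory of the particle moving on S. The velocity $v^\alpha$ of the 
particle, along the trajectory, is $v^\alpha = v\lambda^\alpha$, hence 


    $\dfrac{\delta v^\alpha}{\delta t} = \dfrac{dv}{dt}\,\lambda^\alpha + v\,\dfrac{\delta \lambda^\alpha}{\delta t} = \dfrac{dv}{dt}\,\lambda^\alpha + v^2\,\dfrac{\delta \lambda^\alpha}{\delta s}.$

If we recall that 

[71.6]    $\dfrac{\delta \lambda^\alpha}{\delta s} = \kappa_g \eta^\alpha,$


where $\eta^\alpha$ is the unit normal to the trajectory in the tangent plane, and $\kappa_g$ 
is the geodesic curvature, we can write 


    $a^\alpha = \dfrac{dv}{dt}\,\lambda^\alpha + v^2 \kappa_g \eta^\alpha.$


It follows from this result (cf. equation 80.7) that 

    $F^\alpha = \dfrac{dT}{ds}\,\lambda^\alpha + 2T\kappa_g \eta^\alpha,$


where $T = mv^2/2$. If the vector $F^\alpha$ vanishes identically, then $dT/ds = 0$ 
and $\kappa_g = 0$ along the trajectory. The first of these equations states that 
v = constant, and, if $v \neq 0$, then the trajectory is a geodesic by the theorem 
of Sec. 63. 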


Fig. 36 


As an illustration of the use of equations 80.17, we consider a particle of 
mass m constrained to move under gravity on a smooth paraboloid of 
revolution (Fig. 36), 


(80.18)    $y^3 = \dfrac{1}{4a}\left[(y^1)^2 + (y^2)^2\right], \quad a = \text{constant}.$


If we introduce cylindrical coordinates $(r, \theta, z)$ by setting 

    $y^1 = r\cos\theta, \quad y^2 = r\sin\theta, \quad y^3 = z,$

equation 80.18 becomes 


(80.19)    $z = \dfrac{r^2}{4a},$


and the kinetic energy $T = \tfrac{1}{2}m\dot{y}^i\dot{y}^i$ takes the form 


    $T = \dfrac{m}{2}\left[\left(1 + \dfrac{r^2}{4a^2}\right)\dot{r}^2 + r^2\dot{\theta}^2\right].$


The potential energy of the gravitational field is $V = mgy^3$, which, on 
noting (80.19), takes the form $V = mgr^2/4a$. Since the surface is smooth, 
the reaction R is normal to S and we can use equations 80.17 with 
$F_\alpha = -\partial V/\partial u^\alpha$, since the components of R in the tangent plane to S are zero. 




We parametrize the surface by setting $u^1 = r$, $u^2 = \theta$, substitute in (80.17) for $T$ and $F_\alpha = -\partial V/\partial u^\alpha$, and obtain two equations

(80.20) $\left(1 + \dfrac{r^2}{4a^2}\right)\ddot r + \dfrac{r\dot r^2}{4a^2} - r\dot\theta^2 = -\dfrac{gr}{2a},$

$\dfrac{d}{dt}\left(r^2\dot\theta\right) = 0.$

The second of equations 80.20 gives on integration the equation of angular momentum

(80.21) $r^2\dot\theta = h, \quad h = \text{constant}.$

The elimination of $\dot\theta$ from the first of equations 80.20 with the aid of (80.21) gives

(80.22) $\left(1 + \dfrac{r^2}{4a^2}\right)\ddot r + \dfrac{r\dot r^2}{4a^2} - \dfrac{h^2}{r^3} = -\dfrac{gr}{2a},$

which has a unique solution when the initial position $r = r_0$ and the initial velocity $\dot r = v_0$ of the particle are specified.

If our particle is constrained to move on a horizontal circle $r$ = constant, (80.22) requires that $h^2 = gr^4/2a$, and equation 80.21 then shows that $\dot\theta^2 = g/2a$, so that the angular velocity $\dot\theta$ is independent of the radius of the circle. When the path of the particle is the meridional line $\theta$ = constant, we get from 80.20 the equation

$\left(1 + \dfrac{r^2}{4a^2}\right)\ddot r + \dfrac{r\dot r^2}{4a^2} + \dfrac{gr}{2a} = 0.$
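The radial equation 80.22 lends itself to a quick numerical check. The sketch below is an added illustration, not part of the original text: it integrates (80.22) with a classical fourth-order Runge-Kutta step, and the function names, the step size, and the values chosen for $a$, $g$, $h$ are all assumptions made for the example. It can be used to confirm two statements made above, namely that a circle $r$ = constant persists when $h^2 = gr^4/2a$, and that $T + V$ stays constant along a solution.

```python
import math

G = 9.81  # assumed gravitational acceleration


def radial_acceleration(r, r_dot, h, a, g=G):
    """Solve equation 80.22 for the second time derivative of r."""
    rhs = h**2 / r**3 - g * r / (2.0 * a) - r * r_dot**2 / (4.0 * a**2)
    return rhs / (1.0 + r**2 / (4.0 * a**2))


def energy_per_unit_mass(r, r_dot, h, a, g=G):
    """(T + V)/m on the paraboloid, with theta-dot eliminated by r^2 theta-dot = h."""
    kinetic = 0.5 * ((1.0 + r**2 / (4.0 * a**2)) * r_dot**2 + h**2 / r**2)
    potential = g * r**2 / (4.0 * a)
    return kinetic + potential


def circular_h(r0, a, g=G):
    """Angular momentum constant h for which r = r0 is a circular path."""
    return math.sqrt(g * r0**4 / (2.0 * a))


def integrate(r0, v0, h, a, dt=1.0e-3, steps=5000, g=G):
    """Advance (r, r_dot) by `steps` RK4 steps and return the final state."""
    r, v = r0, v0
    for _ in range(steps):
        k1r, k1v = v, radial_acceleration(r, v, h, a, g)
        k2r = v + 0.5 * dt * k1v
        k2v = radial_acceleration(r + 0.5 * dt * k1r, v + 0.5 * dt * k1v, h, a, g)
        k3r = v + 0.5 * dt * k2v
        k3v = radial_acceleration(r + 0.5 * dt * k2r, v + 0.5 * dt * k2v, h, a, g)
        k4r = v + dt * k3v
        k4v = radial_acceleration(r + dt * k3r, v + dt * k3v, h, a, g)
        r += dt * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return r, v
```

Calling `integrate(1.5, 0.0, circular_h(1.5, 1.0), 1.0)` should leave `r` at its starting value, while a general start such as `integrate(1.0, 0.5, 2.0, 1.0)` oscillates between two radii with `energy_per_unit_mass` unchanged to within the integration error.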


The integration of equation 80.22 and the calculation of the reaction $R$ required to constrain the motion to the paraboloid are tedious. To compute the magnitude of the reaction, we need equation 80.15, in which $F = mg + R$.

If we replace the surface of the paraboloid in this illustration by the surface of the sphere, we have the problem of a spherical pendulum. The solution of equations of motion for the spherical pendulum can be obtained with the aid of elliptic functions. When the surface is a cylinder $r = a$, the integration of equations of motion is easy.⁵


5 Interested readers are referred to pp. 99-109 in E. T. Whittaker’s Analytical 
Mechanics, Cambridge Press (1917), where the motion of the particle on a surface is 
analyzed in a different way. 




Problem 
Let a particle of mass $m$ be constrained to move on the surface of a sphere of radius $a$. Relate the orthogonal cartesian coordinates $y^i$ to the surface coordinates $u^\alpha$ by the formulas

$y^1 = a\sin u^1\cos u^2,$
$y^2 = a\sin u^1\sin u^2,$
$y^3 = a\cos u^1.$

Show that equations 80.17 yield

$\ddot u^1 - (\dot u^2)^2\sin u^1\cos u^1 = \dfrac{F_1}{ma^2},$

$\ddot u^2\sin^2 u^1 + 2\dot u^1\dot u^2\sin u^1\cos u^1 = \dfrac{F_2}{ma^2}.$

Solve these equations for the case when $F_\alpha = 0$, and show that the trajectory is an arc of a great circle, and the speed $v$ = const.

Hint: The first integral of the second equation is $\dot u^2\sin^2 u^1$ = constant. Use this result in the first equation and observe that $v^2 = a^2\left[(\dot u^1)^2 + (\dot u^2)^2\sin^2 u^1\right]$.
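The conclusion of this Problem can be checked numerically before it is proved. The sketch below is an added illustration under stated assumptions: it integrates the two equations with zero force by a fourth-order Runge-Kutta step, starting from arbitrarily chosen initial data away from the poles, and the helper names (`accelerations`, `simulate`, and so on) are hypothetical. A great circle lies in a plane through the center of the sphere, so the position vector should stay perpendicular to the initial value of $\mathbf{r}\times\mathbf{v}$, and the speed should not change.

```python
import math


def accelerations(u1, u2, du1, du2):
    """The two equations of the Problem with zero force, solved for the second derivatives."""
    ddu1 = du2**2 * math.sin(u1) * math.cos(u1)
    ddu2 = -2.0 * du1 * du2 * math.cos(u1) / math.sin(u1)
    return ddu1, ddu2


def rk4_step(state, dt):
    """One RK4 step for the state (u1, u2, du1, du2)."""
    def deriv(s):
        u1, u2, du1, du2 = s
        ddu1, ddu2 = accelerations(u1, u2, du1, du2)
        return (du1, du2, ddu1, ddu2)

    def shifted(s, k, factor):
        return tuple(x + factor * kx for x, kx in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        x + dt * (p + 2.0 * q + 2.0 * r + s) / 6.0
        for x, p, q, r, s in zip(state, k1, k2, k3, k4)
    )


def simulate(state, dt=1.0e-3, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


def position(u1, u2, a=1.0):
    """Cartesian coordinates y^i of the point (u1, u2) on the sphere of radius a."""
    return (a * math.sin(u1) * math.cos(u2),
            a * math.sin(u1) * math.sin(u2),
            a * math.cos(u1))


def velocity(u1, u2, du1, du2, a=1.0):
    """Cartesian velocity obtained from the chain rule."""
    return (a * (math.cos(u1) * math.cos(u2) * du1 - math.sin(u1) * math.sin(u2) * du2),
            a * (math.cos(u1) * math.sin(u2) * du1 + math.sin(u1) * math.cos(u2) * du2),
            -a * math.sin(u1) * du1)


def dot(p, q):
    return sum(x * y for x, y in zip(p, q))


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])
```

With the start `(1.0, 0.0, 0.3, 0.5)`, for example, the quantity `dot(position(...), normal)` evaluated at the end of `simulate` remains at the level of the integration error, where `normal` is `cross(position, velocity)` at the start.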


81. The Symbol of Variation 


In this section we recall the definition of the variational symbol $\delta$, first introduced in Sec. 56, and record several of its properties. The notation introduced here permits one to give a concise formulation of Hamilton's principle and Lagrange's principle of least action. Either of these principles (rather than the Newtonian laws) can serve as a starting point in the development of analytical dynamics.

Let $F(x^1, x^2, \ldots, x^n)$ be a function of $n$ independent variables $x^i$ of class $C^2$ in some region $R$ of an $n$-dimensional manifold. We shall be concerned with the behavior of the function $F$ in a certain neighborhood of the curve $C$, defined by the parametric equations

$C: \quad x^i = x^i(t), \quad t_1 \le t \le t_2,$

where we assume that the $x^i(t)$ are of class $C^2$.

Consider an $h$-neighborhood of the curve $C$, defined by the inequalities

$x^i - h < \bar x^i < x^i + h, \quad (i = 1, \ldots, n),$

where $h$ is a small positive number, and the $x^i$ are the coordinates of a point on $C$. We introduce a class of functions

$C': \quad \bar x^i(t, \epsilon) = x^i(t) + \epsilon\,\xi^i(t), \quad (i = 1, \ldots, n),$

where $-1 < \epsilon < 1$ and the $\xi^i(t)$ are single-valued functions of class $C^2$ in $t_1 \le t \le t_2$ such that

$\xi^i(t_1) = \xi^i(t_2) = 0$

and $|\xi^i(t)| < h$, uniformly in $t_1 \le t \le t_2$.

A set of $n$ functions $\bar x^i(t, \epsilon)$ constitutes a varied path, and it is clear that the curves $C'$ so defined can be made to belong to the $h$-neighborhood of $C$. In the space of two dimensions the curves $C'$ all lie in a band of width $2h$ about the curve $C$ and coincide with $C$ at the end points of the interval $(t_1, t_2)$.

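The variation $\delta x^i$ was defined in Sec. 56 by the formula

(81.1) $\delta x^i = \epsilon\,\xi^i(t),$

and the variation $\delta F$ of the function $F(x^1, \ldots, x^n)$ is

$\delta F = \epsilon\left(\dfrac{dF}{d\epsilon}\right)_0,$

where

$\left(\dfrac{dF}{d\epsilon}\right)_0 = \left.\dfrac{dF(x^1 + \epsilon\xi^1, \ldots, x^n + \epsilon\xi^n)}{d\epsilon}\right|_{\epsilon = 0} = \dfrac{\partial F}{\partial x^i}\,\xi^i.$

Thus

(81.2) $\delta F = \dfrac{\partial F}{\partial x^i}\,\delta x^i.$

Consider next the function $\dot x^i(t) = dx^i/dt$. We form

$\dot{\bar x}^i(t, \epsilon) = \dot x^i(t) + \epsilon\,\dot\xi^i(t),$

and conclude from definition 81.1 that

$\delta\dot x^i = \epsilon\left(\dfrac{d\dot{\bar x}^i}{d\epsilon}\right)_0 = \epsilon\,\dot\xi^i = \dfrac{d}{dt}\,\delta x^i.$

Hence

(81.3) $\delta\,\dfrac{dx^i}{dt} = \dfrac{d}{dt}\,\delta x^i,$

so that the variation of the derivative is the derivative of the variation. Clearly, if we have a function $F(x^1, \ldots, x^n, \dot x^1, \ldots, \dot x^n, t)$ of $2n + 1$ variables $x^i$, $\dot x^i = dx^i/dt$, and $t$, which is of class $C^2$, we can write

(81.4) $\delta F = \dfrac{\partial F}{\partial x^i}\,\delta x^i + \dfrac{\partial F}{\partial \dot x^i}\,\delta\dot x^i.$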




A simple calculation, analogous to that used in deducing formula 81.3, leads to the conclusion that

(81.5) $\delta\,\dfrac{dF}{dt} = \dfrac{d}{dt}\,\delta F,$

and one can readily show with the aid of (81.4) that

$\delta(F + \Phi) = \delta F + \delta\Phi,$
$\delta(F\Phi) = F\,\delta\Phi + \Phi\,\delta F,$

where $F$ and $\Phi$ are any functions satisfying the conditions laid down above, and the variational symbol $\delta$ refers to the same varied path $C'$.
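In Sec. 57 we considered the functional

$J = \int_{t_1}^{t_2} F(x^1, \ldots, x^n, \dot x^1, \ldots, \dot x^n, t)\,dt,$

where the functional arguments $x^i(t)$, $\{t_1 \le t \le t_2\}$, belonged to the $h$-neighborhood of an extremal of $J$. That is, we considered the behavior of the integral $J$ along the varied paths $\bar x^i(t, \epsilon) = x^i + \epsilon\,\xi^i(t)$. Making use of equation 81.4 of this section and referring to formula 57.6, we see that formula 57.6 can be written

$\delta J = \int_{t_1}^{t_2}\left(\dfrac{\partial F}{\partial x^i}\,\delta x^i + \dfrac{\partial F}{\partial\dot x^i}\,\delta\dot x^i\right)dt,$

so that, for a pair of fixed limits $t_1$ and $t_2$,

$\delta J = \int_{t_1}^{t_2}\delta F\,dt = \delta\int_{t_1}^{t_2} F\,dt.$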


When stated in words the foregoing equation reads: The variation of the 
integral with fixed limits is equal to the integral of the variation of the 
integrand. 
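The interchange of $\delta$ and the integral sign can be tried on a concrete integrand. The sketch below is an added illustration: the path $x = t^2$, the function $\xi = \sin \pi t$, the integrand $F = \tfrac{1}{2}\dot x^2 - \tfrac{1}{2}x^2$, and all function names are assumptions chosen for the example, and the variation is taken as $\delta x = \epsilon\xi$. It computes $\delta J$ once as $\epsilon\,(dJ/d\epsilon)_0$ by a central difference and once as the integral of $\delta F$ from formula 81.4.

```python
import math


def simpson(f, t1, t2, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (t2 - t1) / n
    total = f(t1) + f(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(t1 + k * step)
    return total * step / 3.0


# Illustrative path and comparison function (assumptions, not from the text).
def x(t):
    return t * t


def x_dot(t):
    return 2.0 * t


def xi(t):
    return math.sin(math.pi * t)  # vanishes at t = 0 and t = 1


def xi_dot(t):
    return math.pi * math.cos(math.pi * t)


def F(x_val, x_dot_val):
    return 0.5 * x_dot_val**2 - 0.5 * x_val**2


def J(eps):
    """The functional evaluated on the varied path x + eps * xi."""
    return simpson(lambda t: F(x(t) + eps * xi(t), x_dot(t) + eps * xi_dot(t)), 0.0, 1.0)


def variation_of_integral(eps, eta=1.0e-3):
    """delta J = eps * (dJ/d eps) at eps = 0, estimated by a central difference."""
    return eps * (J(eta) - J(-eta)) / (2.0 * eta)


def integral_of_variation(eps):
    """Integral of delta F = F_x * delta x + F_xdot * delta xdot, as in (81.4)."""
    def delta_F(t):
        return -x(t) * (eps * xi(t)) + x_dot(t) * (eps * xi_dot(t))
    return simpson(delta_F, 0.0, 1.0)
```

Because this $F$ is quadratic in its arguments, the central difference is exact apart from rounding, so the two numbers agree closely for any `eps`.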

We shall make use of the symbolism introduced in this section to 
formulate Hamilton’s principle. 


82. Hamilton’s Principle 


Consider a particle of mass $m$ moving in a three-dimensional Euclidean manifold, referred to a curvilinear system of coordinates $X$. The particle is in motion under the influence of force $F_i$, and our problem is to determine the trajectory

$C: \quad x^i = x^i(t), \quad (i = 1, 2, 3), \quad t_1 \le t \le t_2,$

where $t$ denotes the time.


The kinetic energy $T$ of the particle (which has a physical meaning only along the trajectory $C$) is given by the formula $T = \tfrac{1}{2}mg_{ij}\dot x^i\dot x^j$. If we define a family of varied paths

$C': \quad \bar x^i(t, \epsilon) = x^i(t) + \epsilon\,\xi^i(t),$

with $\delta x^i(t) = \epsilon\,\xi^i(t)$ and $\xi^i(t_1) = \xi^i(t_2) = 0$, belonging to the $h$-neighborhood of $C$, we can speak of the variation of $T$, namely,

(82.1) $\delta T = \dfrac{\partial T}{\partial\dot x^i}\,\delta\dot x^i + \dfrac{\partial T}{\partial x^i}\,\delta x^i,$

and we can phrase Hamilton's principle as follows:

HAMILTON'S PRINCIPLE. If a particle is at the point $P_1$ at the time $t_1$ and at the point $P_2$ at the time $t_2$, then the motion of the particle takes place in such a way that

(82.2) $\int_{t_1}^{t_2}\left(\delta T + F_i\,\delta x^i\right)dt = 0,$

where $x^i = x^i(t)$ are the coordinates of the particle along the trajectory and $x^i + \delta x^i$ are the coordinates along a varied path beginning at $P_1$ at time $t_1$ and ending at $P_2$ at time $t_2$.

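It will be shown next that this principle is equivalent to Lagrangean equations of motion 79.2, and hence to Newtonian laws. The proof is simple. Substituting (82.1) in (82.2) yields

(82.3) $\int_{t_1}^{t_2}\left(\dfrac{\partial T}{\partial\dot x^i}\,\delta\dot x^i + \dfrac{\partial T}{\partial x^i}\,\delta x^i + F_i\,\delta x^i\right)dt = 0.$

Integrating the first term under the integral sign of (82.3) by parts,

$\int_{t_1}^{t_2}\dfrac{\partial T}{\partial\dot x^i}\,\delta\dot x^i\,dt = \left[\dfrac{\partial T}{\partial\dot x^i}\,\delta x^i\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot x^i}\right)\delta x^i\,dt,$

and, since $\delta x^i(t_2) = \delta x^i(t_1) = 0$ by virtue of $\xi^i(t_2)$ and $\xi^i(t_1)$ vanishing, equation 82.3 becomes

(82.4) $\int_{t_1}^{t_2}\left(F_i + \dfrac{\partial T}{\partial x^i} - \dfrac{d}{dt}\dfrac{\partial T}{\partial\dot x^i}\right)\delta x^i\,dt = 0.$

Since this integral vanishes for arbitrary $\delta x^i$, the argument used in Sec. 57 shows that

(82.5) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot x^i}\right) - \dfrac{\partial T}{\partial x^i} = F_i, \quad (i = 1, 2, 3).$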




Conversely, if Lagrangean equations 82.5 hold, then equation 82.4, and 
hence equation 82.2, is valid. 
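The equivalence can be seen at work on the simplest example, a particle moving on a vertical line under its weight. The sketch below is an added illustration: the mass, the duration, the starting velocity, the choice $\delta x = \epsilon\sin(\pi t/t_2)$, and the function names are all assumptions. It evaluates the integral in (82.2), with $\delta T = m\dot x\,\delta\dot x$ and $F = -mg$, once along the actual free-fall motion and once along a uniform motion that does not satisfy Newton's law.

```python
import math

M, G, T_END = 1.0, 9.81, 1.0  # assumed mass, gravitational acceleration, duration


def simpson(f, t1, t2, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (t2 - t1) / n
    total = f(t1) + f(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(t1 + k * step)
    return total * step / 3.0


def hamilton_integral(x_dot, eps=0.1):
    """Integral of (delta T + F delta x) dt for T = m x_dot^2 / 2 and F = -m g."""
    def integrand(t):
        delta_x = eps * math.sin(math.pi * t / T_END)
        delta_x_dot = eps * (math.pi / T_END) * math.cos(math.pi * t / T_END)
        delta_T = M * x_dot(t) * delta_x_dot
        return delta_T + (-M * G) * delta_x
    return simpson(integrand, 0.0, T_END)


def free_fall_velocity(t, v0=2.0):
    """Velocity along the actual trajectory, which satisfies m x'' = -m g."""
    return v0 - G * t


def uniform_velocity(t, v0=2.0):
    """Velocity along a path that ignores the force."""
    return v0
```

`hamilton_integral(free_fall_velocity)` vanishes to within quadrature error, whereas `hamilton_integral(uniform_velocity)` does not, which is the content of the statement that (82.2) singles out the trajectory.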

In the foregoing formulation of Hamilton's principle no reference is made to the nature of the force field $F_i$. If, in particular, this field is conservative, then there exists a potential function $V(x^1, x^2, x^3)$ such that $\partial V/\partial x^i = -F_i$. In this case equation 82.2 reads

$\int_{t_1}^{t_2}\left(\delta T - \dfrac{\partial V}{\partial x^i}\,\delta x^i\right)dt = 0,$

and, since $\delta V = (\partial V/\partial x^i)\,\delta x^i$, we have

(82.6) $\int_{t_1}^{t_2}\delta(T - V)\,dt = 0.$

But in Sec. 79 we defined the Lagrangean function $L = T - V$, so that equation 82.6 can be written as $\int_{t_1}^{t_2}\delta L\,dt = 0$, and, since the limits of integration are fixed, we have a concise formulation of Hamilton's principle for a conservative field in the form

(82.7) $\delta\int_{t_1}^{t_2} L\,dt = 0.$

We can state equation 82.7, in words, as follows: In a conservative field of force a particle moves so that the integral $\int_{t_1}^{t_2} L\,dt$, evaluated along the trajectory $x^i = x^i(t)$, $t_1 \le t \le t_2$, has a stationary value in comparison with its values for all neighboring paths beginning at the point $P_1$ at $t = t_1$ and ending at point $P_2$ at $t = t_2$.

Equations of motion in form 79.5, namely,

$\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial\dot x^i}\right) - \dfrac{\partial L}{\partial x^i} = 0,$

follow at once from the formulation 82.7.


83. Integral of Energy 


We establish in this section an important general 

THEOREM. The motion of a particle in a conservative field of force is 
such that the sum of its kinetic and potential energies is a constant. 

The proof of this theorem follows from an identity which will be 
established next. 


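Since the kinetic energy $T = \tfrac{1}{2}mg_{ij}\dot x^i\dot x^j$ is an invariant,

$\dfrac{dT}{dt} = \dfrac{\delta T}{\delta t} = mg_{ij}\,\dfrac{\delta\dot x^i}{\delta t}\,\dot x^j,$

so that

(83.1) $\dfrac{dT}{dt} = ma_iv^i,$

where $v^i$ is the velocity and $a_i$ is the acceleration of the particle. For a conservative field of force, $ma_i = F_i = -\partial V/\partial x^i$, and we can write (83.1) as

$\dfrac{dT}{dt} = -\dfrac{\partial V}{\partial x^i}\dfrac{dx^i}{dt},$

or

(83.2) $\dfrac{dT}{dt} = -\dfrac{dV}{dt}.$

Integrating (83.2) yields the result

$T + V = h,$

where $h$ is a constant of integration.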
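As a brief check of the theorem, consider the simplest conservative case, a particle on a line acted on by a linear restoring force. This worked example is an added illustration; the spring constant $k$ is an assumed symbol that does not appear in the text.

```latex
T = \tfrac{1}{2} m \dot{x}^2, \qquad V = \tfrac{1}{2} k x^2, \qquad
m\ddot{x} = -\frac{dV}{dx} = -kx,
\\[4pt]
\frac{d}{dt}(T + V) = m\dot{x}\ddot{x} + kx\dot{x} = \dot{x}\,(m\ddot{x} + kx) = 0,
\\[4pt]
\text{so that}\quad \tfrac{1}{2} m \dot{x}^2 + \tfrac{1}{2} k x^2 = h .
```

The constant $h$ is fixed by the initial position and velocity, and the relation bounds the motion by $|x| \le \sqrt{2h/k}$.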


84. Principle of Least Action 


The history of science abounds in attempts to imbed the laws of nature 
in the structure of theology. Several of these, based on the minimal 
concepts, such as Heron’s (100 B.C.) doctrine of the shortest path and 
Fermat’s (1601-1665) principle of least time, had an innate esthetic appeal 
to mathematicians. The most celebrated of such attempts, in the domain 
of mechanics, is the doctrine of least action propounded by P. M. L. 
Maupertuis circa 1740. Maupertuis asserted that all activities of nature 
are performed with the least possible expenditure of “action,” which he 
defined as the product of mass, velocity, and distance. In order to fit his 
principle to the known results of mechanics, Maupertuis was obliged to 
alter the definitions of the quantities entering in the product mvs so as to 
suit each problem under consideration. Thus, in the analysis of inelastic collision of two particles of masses $m_1$ and $m_2$, moving with velocities $v_1$ and $v_2$, he minimized the product $mvs$, where $s$ was the distance per unit time. This made the "action" proportional to the kinetic energy. Maupertuis obtained the known correct expression for the final common velocity, $v = (m_1v_1 + m_2v_2)/(m_1 + m_2)$. On the other hand, in the problem
of refraction of light passing from one optical medium to another he used 
the actual distance s and got the constant (but incorrect) value for the ratio 
of the sines of the angles of the incident and refracted rays. The doctrine 
of Maupertuis, who believed that it furnished a scientific demonstration of 
the existence of God, excited the imaginations of Daniel Bernoulli and 
Euler and was defended by them. In 1744, Euler showed that the integral $\int mv\,ds$ has a stationary value along the trajectory of a particle moving in a central field of force. In 1760, Lagrange extended Euler's result by demonstrating that the integral $A = \int_{P_1}^{P_2} m\mathbf{v}\cdot d\mathbf{s}$ has a stationary value along the trajectories of particles moving in a conservative force field, provided that the constraints are not functions of the time. This led him to formulate the principle of least action. This formulation still left a great deal to be desired from the point of view of clarity, and Hamilton, in an attempt to understand Lagrange's formulation of the principle, deduced a broader and different principle (1827) discussed in Sec. 82. The proof of the Lagrangean principle, which put it on a secure basis, was supplied by Jacobi.
Let us consider the integral of Lagrange

(84.1) $A = \int_{P_1}^{P_2} m\mathbf{v}\cdot d\mathbf{s},$

evaluated over the path

$C: \quad x^i = x^i(t), \quad t_1 \le t \le t_2,$

where $C$ is the trajectory of the particle of mass $m$ moving in a conservative field of force. We suppose that neither the kinetic energy $T$ nor the potential energy $V$ is a function of time.

In curvilinear coordinates the integral 84.1 assumes the form

$A = \int_{P_1}^{P_2} mg_{ij}\,\dfrac{dx^i}{dt}\,dx^j = \int_{t(P_1)}^{t(P_2)} mg_{ij}\,\dfrac{dx^i}{dt}\dfrac{dx^j}{dt}\,dt,$

and, since

$T = \dfrac{m}{2}\,g_{ij}\,\dfrac{dx^i}{dt}\dfrac{dx^j}{dt},$

we have

(84.2) $A = \int_{t(P_1)}^{t(P_2)} 2T\,dt.$

This integral has a physical meaning only when evaluated over the trajectory $C$, but its value can be computed along any varied path joining the points $P_1$ and $P_2$. Let us consider a particular set of admissible paths $C'$ along which the function $T + V$, for each value of parameter $t$, has the same constant value $h$. The functional $A$ so determined is called the action integral, and concerning it we can formulate

THE PRINCIPLE OF LEAST ACTION. Of all curves $C'$ passing through $P_1$ and $P_2$ in the neighborhood of the trajectory $C$, which are traversed at a rate such that, for each $C'$, for every value of $t$, $T + V = h$, that one for which the action integral $A$ is stationary is the trajectory of the particle.

When stated in the form of the variational equation, this principle reads

(84.3) $\delta\int_{t(P_1)}^{t(P_2)} 2T\,dt = 0,$

with the auxiliary condition

(84.4) $T + V - h = 0$, on $C'$.

It is important to recognize that in this instance we cannot determine the extremals of the action integral by setting $F$ in the Euler equations 57.7 equal to $2T$, because of the auxiliary condition (84.4). Since $T$ is a function of the velocity $v$, and $V$ is a function of position alone, the times $t(P_2) - t(P_1)$ required to traverse the varied paths $C'$ will differ in general. Thus the upper limit $t(P_2)$ in the integral 84.3 is not fixed. In this case we have the problem in the calculus of variations with variable end points and with one auxiliary condition 84.4. The procedure employed in solving this problem makes use of Lagrange's method of multipliers for a problem with nonholonomic constraints, which we briefly indicate. (Compare Sec. 57.)

We construct a function $F = 2T + \lambda\phi$, where $\phi = T + V - h$, and determine the solution of the system of four equations

$\dfrac{\partial F}{\partial x^i} - \dfrac{d}{dt}\left(\dfrac{\partial F}{\partial\dot x^i}\right) = 0, \quad (i = 1, 2, 3),$
$T + V - h = 0.$


6 Strictly speaking this principle should be called the principle of stationary action. 




An investigation of this system shows that⁷ $\lambda(t) = -1$, and it follows from this fact that the trajectory $C$ is determined by the solution of the system

(84.5) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot x^i}\right) - \dfrac{\partial T}{\partial x^i} = -\dfrac{\partial V}{\partial x^i}, \quad (i = 1, 2, 3).$

These are precisely the Lagrangean equations of motion.

A different and somewhat more illuminating mode of attack on this problem is to reduce it to a consideration of the variational problem with fixed end points by a change of variable. Since the kinetic energy

$T = \dfrac{m}{2}\,g_{ij}\,\dfrac{dx^i}{dt}\dfrac{dx^j}{dt} = \dfrac{m}{2}\left(\dfrac{ds}{dt}\right)^2,$

we have

(84.6) $dt = \sqrt{\dfrac{m}{2T}}\,ds = \sqrt{\dfrac{m}{2(h - V)}}\,ds.$

Consequently the action integral 84.2 can be written⁸

(84.7) $A = \int_{P_1}^{P_2}\sqrt{2m(h - V)}\,ds,$

since along all admissible paths $T = h - V$. The integrand in the preceding integral is clearly independent of $t$. We now parametrize our varied paths $C'$, so that

$C': \quad x^i = x^i(u), \quad u_1 \le u \le u_2,$

where $P_1$: $x^i(u_1)$ and $P_2$: $x^i(u_2)$, and write

$ds = \sqrt{g_{ij}\,x'^i x'^j}\,du,$

where $x'^i = dx^i/du$.

This permits us to write the action integral 84.7 in the form

(84.8) $A = \int_{u_1}^{u_2}\sqrt{2m(h - V)\,g_{ij}\,x'^i x'^j}\,du,$

and, since the limits of integration in (84.8) are fixed, we see that the determination of the trajectory is equivalent to finding the geodesics in a three-dimensional Riemannian manifold with the arc element

(84.9) $dS^2 = 2m(h - V)\,g_{ij}\,dx^i\,dx^j.$


7 See Sec. 88 below and O. Bolza, Vorlesungen über Variationsrechnung, p. 586. 
8 The form 84.7 of the action integral was used by Jacobi. See a discussion of this
integral and its generalizations in C. Carathéodory’s Variationsrechnung, pp. 255, 290. 


If we form Euler's equations

$\dfrac{\partial F}{\partial x^i} - \dfrac{d}{du}\left(\dfrac{\partial F}{\partial x'^i}\right) = 0,$

with $F = \sqrt{2m(h - V)\,g_{ij}\,x'^i x'^j}$, and take cognizance of equation 84.6 in the form

$\dfrac{ds}{dt} = \sqrt{\dfrac{2(h - V)}{m}},$

we get the desired equations 84.5.

We see from formulas 84.8 and 84.9 that the action is equal numerically to the length of the curve in a Riemannian manifold with metric coefficients

$h_{ij} = 2m(h - V)\,g_{ij},$

and that the trajectories in $E_3$ correspond to the geodesics in a Riemannian space metrized by the formula $dS^2 = h_{ij}\,dx^i\,dx^j$. This geometrization of dynamics had a far-reaching effect on the developments in relativistic dynamics.
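The numerical equality of the action and the length of the path in the metric $2m(h - V)g_{ij}$ can be checked on a projectile in a uniform field. The sketch below is an added illustration; the mass, launch velocity, flight time, and function names are assumptions. It evaluates the action once as the time integral of $2T$ along the parabola, in closed form, and once as a purely geometric sum of short chords weighted by $\sqrt{2m(h - V)}$, which makes no reference to the time.

```python
import math

M, G = 2.0, 9.81                 # assumed mass and gravitational acceleration
VX, VY, T_END = 3.0, 4.0, 0.6    # assumed launch velocity components and flight time
H = 0.5 * M * (VX**2 + VY**2)    # energy constant h = T + V, with V = 0 at launch


def position(t):
    """Cartesian coordinates of the projectile at time t."""
    return VX * t, VY * t - 0.5 * G * t * t


def potential(y):
    return M * G * y


def action_as_time_integral():
    """Closed form of the integral of 2T dt along the trajectory, as in (84.2)."""
    return M * (VX**2 * T_END + (VY**3 - (VY - G * T_END)**3) / (3.0 * G))


def action_as_jacobi_length(segments=20000):
    """Length of the same curve in the metric dS^2 = 2m(h - V) ds^2, as in (84.9)."""
    total = 0.0
    x_prev, y_prev = position(0.0)
    for k in range(1, segments + 1):
        x_next, y_next = position(T_END * k / segments)
        chord = math.hypot(x_next - x_prev, y_next - y_prev)
        y_mid = 0.5 * (y_prev + y_next)
        total += math.sqrt(2.0 * M * (H - potential(y_mid))) * chord
        x_prev, y_prev = x_next, y_next
    return total
```

The time parameter enters `action_as_jacobi_length` only as a convenient way of generating points on the curve; any other parametrization of the same parabola would give the same sum, in keeping with the remark that the integrand of 84.7 is independent of $t$.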


85. Systems of Particles. Generalized Coordinates 


We have already remarked (in Sec. 77) that the passage from mechanics 
of a single particle to mechanics of material bodies can be accomplished by 
introducing certain assumptions regarding the nature of constraining 
forces operating on particles making up the body. In some dynamical 
problems the change of shape of the body is so slight that one is justified in 
supposing that the particles remain at fixed distances from one another. 
This assumption leads to the dynamics of rigid bodies. If a body suffers
nonnegligible deformations we can postulate, with varying degrees of
realism, the nature of constraining forces and thus arrive at the dynamics
of elastic bodies, ideal fluids, viscoplastic media, and so on. The assumptions
concerning the nature of constitutive forces permit us to characterize
the positions of a large number of material particles in terms of relatively
few descriptive parameters. Thus a thin rigid rod of length $l$, moving in
space, requires only five parameters for the determination of its position.
These can be taken as space coordinates of its center of mass and two 
direction ratios of one of the ends relative to the center of mass. The 
choice of descriptive parameters is not unique, and they clearly need not 
have the dimensions of length. A bead sliding on a curved wire requires 
only one parameter for the description of its location, say the distance from 
some fixed point on the wire; a particle moving on the surface is located 
unambiguously by a pair of Gaussian coordinates. Whatever is the nature
of descriptive parameters, they will be termed the generalized coordinates.
Clearly, if the characterization of dynamical systems is to be complete, the
generalized coordinates must be functionally connected with the space
coordinates of particles making up the system.

Let there be $N$ particles composing a system, and let $x^i_{(\alpha)}$, $(i = 1, 2, 3)$, $(\alpha = 1, 2, \ldots, N)$, be the positional coordinates of these particles referred to some convenient reference frame in $E_3$. The system of $N$ free particles is described by $3N$ parameters. If the particles are constrained in some way, there will be certain relations among the coordinates $x^i_{(\alpha)}$, and we suppose that there are $r$ such independent relations,

(85.1) $f^j\left(x^1_{(1)}, x^2_{(1)}, x^3_{(1)};\ x^1_{(2)}, x^2_{(2)}, x^3_{(2)};\ \ldots;\ x^1_{(N)}, x^2_{(N)}, x^3_{(N)}\right) = 0, \quad (j = 1, 2, \ldots, r).$

If these $r$ equations of constraints 85.1 can be solved for some $r$ coordinates in terms of the remaining $3N - r$ coordinates, the latter can be viewed as the independent generalized coordinates $q^i$. It is more convenient, however, to assume that each of the $3N$ coordinates is expressed in terms of $3N - r = n$ independent variables $q^i$, and write $3N$ equations

(85.2) $x^i_{(\alpha)} = x^i_{(\alpha)}(q^1, \ldots, q^n, t),$

where we introduced the time parameter $t$ which may enter in the problem explicitly if one deals with moving constraints.⁹ If $t$ does not enter explicitly in equations 85.2, the dynamical system is called a natural system.

We will suppose that the functions $x^i_{(\alpha)} = x^i_{(\alpha)}(q, t)$ are of class $C^2$ in the region of definition of the variables $q^i$ and $t$ and that the Jacobian matrix $(\partial x^i_{(\alpha)}/\partial q^j)$ is of rank $n$ [cf. (75.5)].

The velocities of the particles are given by differentiating equations 85.2 with respect to time. Thus

(85.3) $\dot x^i_{(\alpha)} = \dfrac{\partial x^i_{(\alpha)}}{\partial q^j}\,\dot q^j + \dfrac{\partial x^i_{(\alpha)}}{\partial t}.$

We shall call the time derivatives $\dot q^i$ of generalized coordinates $q^i$ the generalized velocities.
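Formula 85.3 can be made concrete with a system that has a moving constraint. The sketch below is an added illustration: a plane pendulum whose pivot is driven horizontally, described by the single generalized coordinate $q$ (the angle from the downward vertical). The pendulum length, the pivot motion, and the function names are assumptions. The velocity computed from (85.3) is compared with a finite difference of the position along a prescribed motion $q(t)$.

```python
import math

LENGTH, AMP, OMEGA = 1.0, 0.2, 3.0  # assumed pendulum length and pivot motion


def bob_position(q, t):
    """Equations of type 85.2: pivot at x = AMP*sin(OMEGA*t), bob at angle q."""
    return (AMP * math.sin(OMEGA * t) + LENGTH * math.sin(q),
            -LENGTH * math.cos(q))


def bob_velocity(q, q_dot, t):
    """Formula 85.3: (dx/dq) q_dot + dx/dt, written out for this system."""
    return (LENGTH * math.cos(q) * q_dot + AMP * OMEGA * math.cos(OMEGA * t),
            LENGTH * math.sin(q) * q_dot)


def finite_difference_velocity(q_of_t, t, dt=1.0e-6):
    """Central difference of the position along a prescribed motion q(t)."""
    x1, y1 = bob_position(q_of_t(t - dt), t - dt)
    x2, y2 = bob_position(q_of_t(t + dt), t + dt)
    return ((x2 - x1) / (2.0 * dt), (y2 - y1) / (2.0 * dt))
```

The term `AMP * OMEGA * math.cos(OMEGA * t)` is the contribution of $\partial x/\partial t$; it is absent in a natural system, where the time does not enter equations 85.2 explicitly.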

Occasionally, for symmetry reasons, it is desirable to introduce a number of superfluous coordinates $q^i$, and describe the system with the aid of $k > n$ coordinates $q^1, \ldots, q^k$. In this event there will exist certain relations of the form

(85.4) $f^j(q^1, \ldots, q^k, t) = 0,$


9 For example, a bead sliding on a wire while the wire itself is moving with specified
velocity.


so that the quantities $q^i$, and hence $\dot q^i$, are no longer independent. There will be relations among the $\dot q$'s of the type

(85.5) $\dfrac{\partial f^j}{\partial q^i}\,\dot q^i + \dfrac{\partial f^j}{\partial t} = 0,$

when the $f^j$ are differentiable.

Since equations 85.5 were obtained by differentiating equations 85.4, it is clear that they are integrable, so that one can deduce from them equations 85.4, and use them to eliminate the superfluous coordinates.

In some problems, however, functional relations of the type

(85.6) $F^j(q^1, \ldots, q^k;\ \dot q^1, \ldots, \dot q^k,\ t) = 0, \quad (j = 1, 2, \ldots, m),$

arise which are nonintegrable, that is, it may be impossible¹⁰ to deduce from these differential equations solutions of the type 85.4. The behavior of the system in such event cannot be described with the aid of fewer than $k$ coordinates, so that all $k$ coordinates are independent. If nonintegrable relations 85.6 occur in the problem, we shall say that the given system has $k - m$ degrees of freedom, where $m$ is the number of independent nonintegrable relations 85.6 and $k$ is the number of independent coordinates.
The dynamical systems involving nonintegrable relations 85.6 are called 
nonholonomic to distinguish them from holonomic systems in which the 
number of degrees of freedom is equal to the number of independent 
generalized coordinates. In other words, a holonomic system is one in 
which there are no nonintegrable relations involving the generalized 
velocities. 

In the following section we derive the Lagrangean equations for a 
holonomic system, and in Sec. 88 we treat briefly one important class of 
nonholonomic systems occurring frequently in applications. 


86. Lagrangean Equations in Generalized Coordinates 


For concreteness of presentation the definitions of Sec. 85 were intro- 
duced with reference to systems consisting of a finite but, perhaps, large 
number of particles. These definitions can be readily extended to apply 


10 A billiard ball rolling and spinning on a rough table is an example of this situation. 
To specify the position of the ball one needs five generalized coordinates; two of these 
may locate its center, and three the angles describing the orientation of the ball relative 
to the center. Since the table is rough, the ball cannot slip, so that both velocity 
components of the point of contact must vanish. This gives two constraining relations 
of the form (85.6), involving the velocity components. They are nonintegrable, since, 
at any position of the center, the orientation of the ball can be changed without violating 
the constraints. 


to continuous bodies, the points of which have coordinates $x^r$ relative
to some reference system $X$.

The particles of a continuous body are subjected to constraints of various sorts, and we shall suppose throughout the remainder of this chapter that the bodies under consideration are rigid, so that the material points remain at invariable distances from one another. If the points of the body are uniquely determined by a finite number of generalized coordinates $q^i$, we will write

$x^r = x^r(q^1, \ldots, q^n, t), \quad (r = 1, 2, 3),$

and assume, as in Sec. 85, that the functions $x^r(q, t)$ are of class $C^2$. The velocity $\dot x^r$ of any point of the body is given by

$\dot x^r = \dfrac{\partial x^r}{\partial q^j}\dfrac{dq^j}{dt} + \dfrac{\partial x^r}{\partial t} = \dfrac{\partial x^r}{\partial q^j}\,\dot q^j + \dfrac{\partial x^r}{\partial t}, \quad (j = 1, \ldots, n),$

where the $\dot q^j$ are generalized velocities.

Let the system in question be natural, holonomic, with $n$ degrees of freedom, so that the relations

(86.1) $x^r = x^r(q^1, \ldots, q^n)$

involve $n$ independent parameters $q^i$. The velocities $\dot x^r$ in this case are given by [cf. (80.11)]

(86.2) $\dot x^r = \dfrac{\partial x^r}{\partial q^j}\,\dot q^j, \quad (r = 1, 2, 3;\ j = 1, 2, \ldots, n),$

where the $\dot q^j$ transform under any admissible transformation

(86.3) $\bar q^k = \bar q^k(q^1, \ldots, q^n), \quad (k = 1, \ldots, n),$

in accordance with the contravariant law.

The kinetic energy of the system is given by the expression of the form

(86.4) $T = \tfrac{1}{2}\sum_\alpha m_\alpha\,g_{rs}\,\dot x^r_\alpha\dot x^s_\alpha, \quad (r, s = 1, 2, 3),$

where $m$ is the mass of the particle located at the point $x^r$ and the summation (or integration) is carried over the entire region occupied by the body. The $g_{rs}$ in (86.4) are the components of the metric tensor associated with the coordinate system $X$ covering $E_3$.


If we insert in (86.4) the values of $\dot x^r$ from (86.2), we obtain¹¹

$T = \tfrac{1}{2}\sum m\,g_{rs}\,\dfrac{\partial x^r}{\partial q^i}\dfrac{\partial x^s}{\partial q^j}\,\dot q^i\dot q^j = \tfrac{1}{2}\,a_{ij}\,\dot q^i\dot q^j,$

where

$a_{ij} = \sum m\,g_{rs}\,\dfrac{\partial x^r}{\partial q^i}\dfrac{\partial x^s}{\partial q^j}, \quad (r, s = 1, 2, 3), \quad (i, j = 1, \ldots, n).$

Since

(86.5) $T = \tfrac{1}{2}\,a_{ij}\,\dot q^i\dot q^j$

is an invariant, and the quantities $a_{ij}$ are symmetric, we conclude that the $a_{ij}$ are components of a covariant tensor of rank two with respect to a class of admissible transformations 86.3 of generalized coordinates. We note that, since the kinetic energy $T$ is a positive definite form in the velocities $\dot q^i$, $|a_{ij}| > 0$, and we can construct the reciprocal tensor $a^{ij}$.
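The construction of the tensor $a_{ij}$ from the dependence of the particle coordinates on the $q^i$ can be followed on a system of two particles. The sketch below is an added illustration: a plane double pendulum with cartesian coordinates (so that $g_{rs} = \delta_{rs}$), in which the masses, rod lengths, and function names are assumptions. It forms $a_{ij}$ by summing $m\,(\partial x^r/\partial q^i)(\partial x^r/\partial q^j)$ over both particles, with the partial derivatives estimated by central differences, and compares the result with the coefficients read off from the familiar kinetic energy of the double pendulum.

```python
import math

M1, M2, L1, L2 = 1.0, 2.0, 1.0, 0.5  # assumed masses and rod lengths


def particle_positions(q):
    """(mass, cartesian coordinates) of both bobs for angles q = (theta1, theta2)."""
    th1, th2 = q
    x1, y1 = L1 * math.sin(th1), -L1 * math.cos(th1)
    x2, y2 = x1 + L2 * math.sin(th2), y1 - L2 * math.cos(th2)
    return [(M1, (x1, y1)), (M2, (x2, y2))]


def jacobian_column(q, i, step=1.0e-6):
    """Central-difference estimate of dx^r/dq^i for every particle."""
    q_plus, q_minus = list(q), list(q)
    q_plus[i] += step
    q_minus[i] -= step
    columns = []
    for (_, p_plus), (_, p_minus) in zip(particle_positions(q_plus),
                                         particle_positions(q_minus)):
        columns.append(tuple((u - v) / (2.0 * step) for u, v in zip(p_plus, p_minus)))
    return columns


def kinetic_metric(q):
    """a_ij = sum over particles of m (dx^r/dq^i)(dx^r/dq^j), with g_rs = delta_rs."""
    masses = [m for m, _ in particle_positions(q)]
    n = len(q)
    cols = [jacobian_column(q, i) for i in range(n)]
    return [[sum(m * sum(u * v for u, v in zip(cols[i][p], cols[j][p]))
                 for p, m in enumerate(masses))
             for j in range(n)]
            for i in range(n)]


def kinetic_metric_exact(q):
    """Coefficients of T = a_ij q_dot^i q_dot^j / 2 for the plane double pendulum."""
    th1, th2 = q
    a12 = M2 * L1 * L2 * math.cos(th1 - th2)
    return [[(M1 + M2) * L1**2, a12], [a12, M2 * L2**2]]
```

The determinant of the matrix returned by `kinetic_metric` is positive for every configuration, as the text asserts for a positive definite kinetic energy.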

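If we carry out a computation, in every detail identical with that of Sec. 79, by using the expression for the kinetic energy in the form 86.5, we obtain the formula

(86.6) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot q^i}\right) - \dfrac{\partial T}{\partial q^i} = a_{il}\left(\ddot q^l + \begin{Bmatrix} l \\ jk \end{Bmatrix}_a\dot q^j\dot q^k\right),$

where the Christoffel symbols $\begin{Bmatrix} l \\ jk \end{Bmatrix}_a$ are constructed from the tensor $a_{kl}$. We denote the expression appearing in the parentheses of the right-hand member of (86.6) by

$Q^l = \ddot q^l + \begin{Bmatrix} l \\ jk \end{Bmatrix}_a\dot q^j\dot q^k$

and write equation 86.6 in the form

(86.7) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot q^i}\right) - \dfrac{\partial T}{\partial q^i} = a_{il}Q^l = Q_i, \quad (i = 1, 2, \ldots, n).$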

The expression in the left-hand member of (86.7) can also be computed by starting with formula 86.4 and by taking cognizance of the dependence of the variables $x^r$ on the parameters $q^i$. A straightforward but somewhat lengthy computation making use of the formula $\partial\dot x^r/\partial\dot q^i = \partial x^r/\partial q^i$ and the relations

$\dfrac{\partial\dot x^r}{\partial q^i} = \dfrac{\partial^2 x^r}{\partial q^i\,\partial q^j}\,\dot q^j = \dfrac{d}{dt}\,\dfrac{\partial x^r}{\partial q^i},$

following from equation 86.2, leads to the result

(86.8) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot q^i}\right) - \dfrac{\partial T}{\partial q^i} = \sum m\,a_r\,\dfrac{\partial x^r}{\partial q^i},$

in which $a_r = g_{rs}a^s$ is the acceleration of the point $P(x)$.

11 For simplicity in writing we omit the subscripts $\alpha$ in terms affected by the symbol $\sum$.
On the other hand, Newton's second law gives

(86.9) $ma_r = F_r,$

where the $F_r$'s are the components of force $F$ acting on the particle located at the point $P(x)$. It follows from (86.9) that

$\sum m\,a_r\,\dfrac{\partial x^r}{\partial q^i} = \sum F_r\,\dfrac{\partial x^r}{\partial q^i},$

and hence equations 86.8 can be written

(86.10) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot q^i}\right) - \dfrac{\partial T}{\partial q^i} = \sum F_r\,\dfrac{\partial x^r}{\partial q^i}.$

Comparing (86.7) with (86.10), we conclude that

$Q_i = \sum F_r\,\dfrac{\partial x^r}{\partial q^i},$

in which the vector $Q_i$ is called generalized force.

The equations

(86.11) $\dfrac{d}{dt}\left(\dfrac{\partial T}{\partial\dot q^i}\right) - \dfrac{\partial T}{\partial q^i} = Q_i$

are known as Lagrangean equations in generalized coordinates. They yield a system of $n$ second-order ordinary differential equations for the generalized coordinates $q^i$. The solutions of these equations in the form

$C: \quad q^i = q^i(t)$

represent the dynamical trajectory of the system.

If there exists a function $V(q^1, \ldots, q^n)$, such that

$-\dfrac{\partial V}{\partial q^i} = Q_i,$

the system is said to be conservative, and for such systems equations 86.11 assume the form

(86.12) $\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial\dot q^i}\right) - \dfrac{\partial L}{\partial q^i} = 0,$

where $L = T - V$ is the kinetic potential.


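Since $L(q, \dot q)$ is a function of both the generalized coordinates and velocities,

$\dfrac{dL}{dt} = \dfrac{\partial L}{\partial q^i}\,\dot q^i + \dfrac{\partial L}{\partial\dot q^i}\,\ddot q^i.$

Inserting in this expression from Lagrangean equations 86.12, we get

(86.13) $\dfrac{dL}{dt} = \dfrac{\partial L}{\partial\dot q^i}\,\ddot q^i + \dot q^i\,\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial\dot q^i}\right) = \dfrac{d}{dt}\left(\dfrac{\partial L}{\partial\dot q^i}\,\dot q^i\right).$

But, since $L = T - V$, and the potential energy $V$ is not a function of the $\dot q^i$,

$\dfrac{\partial L}{\partial\dot q^i}\,\dot q^i = \dfrac{\partial T}{\partial\dot q^i}\,\dot q^i = 2T,$

since $T = \tfrac{1}{2}\,a_{ij}\,\dot q^i\dot q^j$. Thus equation 86.13 can be written in the form

$\dfrac{d(L - 2T)}{dt} = -\dfrac{d(T + V)}{dt} = 0,$

which implies that $T + V = h$ (constant). Thus, along the dynamical trajectory, the sum of the kinetic and potential energies is a constant.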

It follows from this development that the study of natural holonomic 
dynamical systems with n degrees of freedom can be reduced to a study of 
motion of a single particle in the n-dimensional space. 

We can phrase the problem of determining the dynamical trajectory of the system in the language of calculus of variations. Indeed, the statements of Hamilton's principle and of the least action principle, given in Secs. 82 and 84, can be repeated word-for-word if the "point" is interpreted to mean a set of $n$ parameters $q^1, \ldots, q^n$, specifying the configuration of our dynamical system in a certain $n$-dimensional space.

In symbols the principle of Hamilton reads

(86.14) $\int_{t_1}^{t_2}\left(\delta T + Q_i\,\delta q^i\right)dt = 0,$

and, if the force field $Q_i$ is conservative, the principle can be stated in the form

$\delta\int_{t_1}^{t_2} L\,dt = 0.$

These variational equations imply the satisfaction of Lagrangean equations 86.11 and 86.12.


It follows at once from the formulation of the principle of least action in generalized coordinates (cf. equations 84.3 and 84.4) that dynamical trajectories in a conservative field are geodesics in the $n$-dimensional Riemannian manifold with the arc element $dS$ given by

$dS^2 = 2(h - V)\,a_{ij}\,dq^i\,dq^j.$

The fact that the dynamical trajectory can be regarded as a geodesic permits one to geometrize dynamics.
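Problems

Show that the dynamical equations in spherical coordinates with

$ds^2 = (dr)^2 + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2$

assume the form

$m\left(\ddot r - r\dot\theta^2 - r\dot\phi^2\sin^2\theta\right) = -\dfrac{\partial V}{\partial r},$

$m\left[\dfrac{1}{r}\dfrac{d}{dt}\left(r^2\dot\theta\right) - r\dot\phi^2\sin\theta\cos\theta\right] = -\dfrac{1}{r}\dfrac{\partial V}{\partial\theta},$

$m\left[\dfrac{1}{r\sin\theta}\dfrac{d}{dt}\left(r^2\dot\phi\sin^2\theta\right)\right] = -\dfrac{1}{r\sin\theta}\dfrac{\partial V}{\partial\phi},$

whereas in cylindrical coordinates, with $ds^2 = (dr)^2 + r^2(d\theta)^2 + (dz)^2$, they are

$m\left(\ddot r - r\dot\theta^2\right) = -\dfrac{\partial V}{\partial r},$

$\dfrac{m}{r}\dfrac{d}{dt}\left(r^2\dot\theta\right) = -\dfrac{1}{r}\dfrac{\partial V}{\partial\theta},$

$m\ddot z = -\dfrac{\partial V}{\partial z}.$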


87. Virtual Work and Generalized Forces 


In the developments of the preceding sections no characterization of forces $F_r$ acting at a point $(x^r)$ of a rigid body was made. It is customary in the study of mechanics of continuous media to classify forces into three categories:¹²

(a) Internal constitutive forces.
(b) Reactive forces produced by constraints.
(c) External impressed forces.

12 The reactive forces produced by constraints are also external forces.


We can visualize a material body as being composed of a vast number of particles which interact with one another in a rather complicated way. As long as the constitutive internal forces are of the action-reaction type, they need not be taken into account in the dynamical equations, since their resultant at any point $P$ of the body vanishes. Thus the forces $F_r$, appearing in the formulas of Sec. 86, consist of reactive forces produced by constraints and external impressed forces.

To illustrate the meaning of this we can consider a rigid body fixed at some point $O$ by a smooth pin, and subjected to the action of impressed force $F_r$ (see Fig. 37). The pin at $O$ constrains the motion of a body to that of rotation about the point $O$. The reactive force $R_r$ acting at $O$ does no work if the body is displaced so as not to violate the constraints at $O$. We shall term all reactive forces that do no work in an arbitrary displacement which does not violate the constraints workless forces. Any displacement of a point of a body that is consistent with imposed constraints is a virtual displacement,¹³ and we denote such virtual displacements at a point $x^r$ by $\delta x^r$.

[Fig. 37]

The work done by the impressed forces $F_r$ in a virtual displacement $\delta x^r$ is

(87.1) $W_v = \sum F_r\,\delta x^r,$

where the summation is carried over all particles of the body; this will be the total work if the reactive forces are of the workless type. We define $W_v$ to be the virtual work in producing a virtual displacement $\delta x^r$, provided that the reactions are workless. Otherwise, $W_v$ will also contain contributions from the working reactive forces.

It should be noted carefully that a virtual displacement $\delta x^r$ is not necessarily the actual displacement $dx^r$ that the point $P(x^r)$ undergoes under the action of specified forces. It is merely any conceivable displacement that a body can perform without violating the constraints.

If a given natural holonomic system with n degrees of freedom is
described by the generalized coordinates $q^i$, then
$x^r = x^r(q^1, \ldots, q^n)$, and the virtual displacements $\delta x^r$ are
related linearly to the generalized virtual displacements $\delta q^i$,
namely,

(87.2) $\delta x^r = \frac{\partial x^r}{\partial q^i}\,\delta q^i.$

13 Virtual displacements that violate constraints are also used in dynamics, especially
if one is concerned with the computation of reactive forces.




In formula 87.2 the $\delta q^i$’s are arbitrary, and they are necessarily
consistent with constraints imposed on the system, since the coordinates
$q^i$ are independent.¹⁴
If we insert expressions from (87.2) in (87.1), we get

(87.3) $W_\delta = \sum F_r\,\frac{\partial x^r}{\partial q^i}\,\delta q^i = Q_i\,\delta q^i,$

where the last step makes use of the definition of the generalized force
$Q_i$. It follows from this formula that one can calculate the generalized
forces $Q_j$, acting on the system, by computing the work $W_\delta$ produced
by displacing the system through a virtual displacement $\delta q^j \neq 0$,
(j fixed), and with $\delta q^i = 0$, $i \neq j$. Then $Q_j = W_\delta/\delta q^j$.
We shall resort to this method of computing generalized forces in the
illustrative examples of Sec. 89.
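The recipe $Q_j = W_\delta/\delta q^j$ can be tried numerically. The following is a minimal Python sketch, not part of the original text: the helper name `generalized_force`, the finite-difference step `dq`, and the sample pendulum data are assumptions made for illustration. It displaces one coordinate at a time, sums $F_r\,\delta x^r$ over the particles, and divides by the displacement.

```python
import math

def generalized_force(position, forces, q, j, dq=1e-6):
    """Estimate Q_j = W_delta / dq^j by a central difference in q^j alone.

    position(q) returns one tuple of cartesian coordinates per particle;
    forces holds the impressed force on each particle (assumed constant
    over the small displacement).
    """
    q_plus = list(q)
    q_minus = list(q)
    q_plus[j] += dq
    q_minus[j] -= dq
    work = 0.0
    for x_p, x_m, f in zip(position(q_plus), position(q_minus), forces):
        # virtual work of this particle: F_r * delta x^r
        work += sum(fr * (a - b) for fr, a, b in zip(f, x_p, x_m))
    return work / (2.0 * dq)

# Illustrative data (assumed): a pendulum bob on a cord of length l,
# described by the arc length q = l*theta, with gravity along -y^2.
m, g, l = 1.5, 9.81, 2.0

def bob_position(q):
    return [(l * math.sin(q[0] / l), l * (1.0 - math.cos(q[0] / l)))]

Q = generalized_force(bob_position, [(0.0, -m * g)], [0.7], 0)
```

For this sample the estimate should agree closely with the closed form $Q = -mg\sin(q/l)$ obtained for the simple pendulum in Sec. 89.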


88. Nonholonomic Systems 


The derivation of Lagrangean equations in Sec. 86 is based on the
assumption that the dynamical system is holonomic and that its
configuration is described by n independent generalized coordinates $q^i$.
When the $q^i$ are not independent, the derivation of appropriate dynamical
equations from Hamilton’s principle (86.14) hinges on general
considerations presented in Sec. 57.

In dealing with nonholonomic dynamical systems it is customary to
assume that the generalized velocities $\dot q^i$ enter in the constraining
relations linearly. Accordingly, we shall suppose that n generalized
coordinates $q^i$ satisfy m < n conditions of the type

(88.1) $c_{\alpha i}(q^1, \ldots, q^n)\,\dot q^i = 0, \quad (i = 1, \ldots, n;\ \alpha = 1, \ldots, m),$

in which the coefficients $c_{\alpha i}$ are continuously differentiable
functions of the variables $q^i$.
The set of m equations 88.1 can be written in the form

$c_{\alpha i}\,\dot q^i\,\delta t = 0,$

and, since $\dot q^i\,\delta t = \delta q^i$, we have m relations

(88.2) $c_{\alpha i}\,\delta q^i = 0,$

in which the variations $\delta q^i$ in general are not independent.


14 We call attention to the distinction between the virtual displacements $\delta q^i$ and the
actual displacements $dq^i$ taking place along the dynamical trajectory $q^i = q^i(t)$.




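To deduce the dynamical equations from Hamilton’s variational
equation

(88.3) $\int_{t_1}^{t_2} (\delta T + Q_i\,\delta q^i)\,dt = 0,$

in which the $\delta q^i$ are constrained by m relations 88.2, we introduce (cf.
Sec. 57) m unknown functions $\lambda^\alpha(q^1, \ldots, q^n)$ and form with the
aid of (88.2) the sum

(88.4) $\lambda^\alpha c_{\alpha i}\,\delta q^i = 0, \quad (\alpha = 1, \ldots, m;\ i = 1, \ldots, n).$

Since $\delta T = (\partial T/\partial q^i)\,\delta q^i + (\partial T/\partial\dot q^i)\,\delta\dot q^i$, equation 88.3 yields

(88.5) $\int_{t_1}^{t_2}\left(\frac{\partial T}{\partial\dot q^i}\,\delta\dot q^i + \frac{\partial T}{\partial q^i}\,\delta q^i + Q_i\,\delta q^i\right) dt = 0.$

But $\delta\dot q^i = (d/dt)\,\delta q^i$, and the integration by parts of the first
term in the integrand of (88.5) gives (cf. 82.3)

(88.6) $\int_{t_1}^{t_2}\left(\frac{\partial T}{\partial q^i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q^i} + Q_i\right)\delta q^i\,dt = 0,$

when we recall that $\delta q^i(t_1) = \delta q^i(t_2) = 0$ along each varied path.
We rewrite (88.6) by inserting in the integrand the term
$\lambda^\alpha c_{\alpha i}\,\delta q^i = 0$,

(88.7) $\int_{t_1}^{t_2}\left(\frac{\partial T}{\partial q^i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q^i} + Q_i + \lambda^\alpha c_{\alpha i}\right)\delta q^i\,dt = 0.$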


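In formula 88.7, the $\delta q^i$ are constrained by m relations 88.4, and, if
we agree to consider the first $n - m$ coordinates $q^i$ as independent
variables, and suppose that m functions $\lambda^\alpha(q^1, \ldots, q^n)$ can be
chosen so that

(88.8) $\frac{\partial T}{\partial q^i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q^i} + Q_i + \lambda^\alpha c_{\alpha i} = 0, \quad \text{for } i = n - m + 1, \ldots, n,$

then (88.7) reduces to

(88.9) $\int_{t_1}^{t_2}\left(\frac{\partial T}{\partial q^i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q^i} + Q_i + \lambda^\alpha c_{\alpha i}\right)\delta q^i\,dt = 0, \quad (i = 1, 2, \ldots, n - m).$

Since the first $n - m$ variables $q^i$ in the integrand of (88.9) are
independent, the variations $\delta q^i$ for $i = 1, 2, \ldots, n - m$ can be
chosen arbitrarily, and we conclude that

(88.10) $\frac{\partial T}{\partial q^i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q^i} + Q_i + \lambda^\alpha c_{\alpha i} = 0, \quad (i = 1, 2, \ldots, n - m).$

The two sets of equations (88.8) and (88.10) involve n generalized
coordinates $q^i$ and m Lagrangean multipliers $\lambda^\alpha(q^1, \ldots, q^n)$.
By adjoining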




to these equations m equations 88.1, we get $n + m$ equations for the
determination of the q’s and $\lambda$’s.

The circumstances under which the $\lambda^\alpha$ can be determined so as to
satisfy (88.8) were detailed in Sec. 57; they relate to the rank of the
Jacobian matrix for (88.1).

We note that when the equations in (88.8) and (88.10) are written as a
single set

(88.11) $\frac{d}{dt}\frac{\partial T}{\partial\dot q^i} - \frac{\partial T}{\partial q^i} = Q_i + \lambda^\alpha c_{\alpha i}, \quad (i = 1, 2, \ldots, n),$

the right-hand member of (88.11) differs from the right-hand member of
(86.11) by the term $R_i = \lambda^\alpha c_{\alpha i}$. This term corresponds to the
generalized reactive forces produced by constraints when the $Q_i$ are
generalized forces that act on the system in the absence of constraints.

In special situations the $Q_i$ may be derived from a potential
$V(q^1, \ldots, q^n)$.

As an illustration of the use of equations 88.11, we consider a
homogeneous circular cylinder rolling under gravity down a rough inclined
plane.

Let the cylinder of radius a and mass m roll without slipping down
the plane making a fixed angle $\phi$ with the horizontal. The position of
the cylinder is determined by the angle of roll $\theta$ and by the distance x
through which the center of mass of the cylinder moves down the plane.
We shall take as our generalized coordinates $q^1 = \theta$, $q^2 = x$, and note
that the kinetic energy T of the system is the sum of the kinetic energy of
translation of the center of mass and the kinetic energy of rotation about
the center of mass. Thus

(88.12) $T = \tfrac12 m\dot x^2 + \tfrac12 mk^2\dot\theta^2,$

where k is the radius of gyration of the cylinder.

Since the plane is rough, there is a frictional force F acting in the plane,
and we suppose that this force is just sufficient to prevent slipping. In
this event x and $\theta$ are related by

(88.13) $a\dot\theta = \dot x,$

where a is the radius of the cylinder.

The constraint (88.13) is actually holonomic since (88.13) can be
integrated to yield $x = a\theta$, so that the problem can be reduced to the
consideration of one independent variable, say x. However, to illustrate
the theory of this section, we write (88.13) in the form
$c_{1i}\dot q^i = 0$ [cf. (88.1)],

(88.14) $a\dot\theta - \dot x = 0,$

so that $c_{11} = a$, $c_{12} = -1$.


Equations (88.11) then yield

(88.15) $\frac{\partial T}{\partial\theta} - \frac{d}{dt}\frac{\partial T}{\partial\dot\theta} + Q_1 + \lambda a = 0, \quad \frac{\partial T}{\partial x} - \frac{d}{dt}\frac{\partial T}{\partial\dot x} + Q_2 - \lambda = 0.$

Now, the work W done by the gravitational force alone, when the
center of mass moves through a distance x is $W = xmg\sin\phi$. Hence
$V = -xmg\sin\phi$, and

$Q_1 = -\frac{\partial V}{\partial\theta} = 0, \quad Q_2 = -\frac{\partial V}{\partial x} = mg\sin\phi.$

On inserting these expressions in (88.15) and using T in the form 88.12,
we get a pair of equations

(88.16) $mk^2\ddot\theta = \lambda a, \quad m\ddot x = mg\sin\phi - \lambda,$

which, when compared with (88.11), show that the generalized reactions
$R_i$ are

$R_1 = \lambda a, \quad R_2 = -\lambda.$

To compute $\lambda$, observe that $a\dot\theta = \dot x$, so that $\ddot\theta = \ddot x/a$, and use
this relation to eliminate $\ddot x$ and $\ddot\theta$ in (88.16). The result is

$\lambda = \frac{k^2 mg\sin\phi}{a^2 + k^2},$

and hence equations 88.16 yield

(88.17) $m\ddot\theta = \frac{mga\sin\phi}{a^2 + k^2}, \quad m\ddot x = mg\sin\phi - \frac{k^2 mg\sin\phi}{a^2 + k^2}.$

The term $k^2 mg\sin\phi/(a^2 + k^2)$, in the second of equations 88.17,
represents the frictional force F opposing the component $mg\sin\phi$ of the
gravitational force along the plane. If the cylinder is solid,
$k^2 = a^2/2$, and $F = \tfrac13 mg\sin\phi$. The magnitude of frictional force
$F = \mu N$, where $\mu$ is the coefficient of friction and N is the pressure
of the cylinder on the plane. Since $N = mg\cos\phi$, we conclude that
$\mu = F/N = \tfrac13\tan\phi$.
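The multiplier computation for the rolling cylinder can be checked with a few lines of Python. This is only an illustrative sketch: the function name `rolling_cylinder`, the value of g, and the sample mass, radius and incline angle are assumptions, not data from the text.

```python
import math

def rolling_cylinder(m, a, k2, phi, g=9.81):
    """Solve (88.16) together with the constraint a*theta_ddot = x_ddot.

    k2 is the squared radius of gyration. Returns the multiplier lam
    (equal to the friction force F), x_ddot and theta_ddot.
    """
    lam = k2 * m * g * math.sin(phi) / (a * a + k2)
    x_ddot = (m * g * math.sin(phi) - lam) / m   # second equation of (88.16)
    theta_ddot = lam * a / (m * k2)              # first equation of (88.16)
    return lam, x_ddot, theta_ddot

# Assumed sample values: solid cylinder (k^2 = a^2/2) on a 30 degree incline.
m, a, phi = 2.0, 0.1, math.radians(30.0)
lam, x_ddot, theta_ddot = rolling_cylinder(m, a, a * a / 2.0, phi)
mu_min = lam / (m * 9.81 * math.cos(phi))  # F/N, least friction coefficient
```

With these inputs the rolling constraint $\ddot x = a\ddot\theta$ is recovered, and `mu_min` reproduces $\tfrac13\tan\phi$ for the solid cylinder.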

As another illustration of the use of equations 88.11 we consider the
brachistochrone problem in a resisting medium.¹⁵


15 See G. A. Bliss, “The Problem of Lagrange in the Calculus of Variations,” American
Journal of Mathematics, 52 (1930) and L. A. Pars, Calculus of Variations (1962), pp.
241-243.




Let it be required to determine an arc of a continuously differentiable
curve

(88.18) $C:\ y = y(x), \quad y(x_1) = y_1, \quad y(x_2) = y_2,$

such that the time of descent of a bead of unit mass, moving on C under
gravity, is as short as possible. We suppose that the motion is opposed by
a force R(v) per unit mass, where R(v) is a continuously differentiable
function of the speed v.

We choose the positive Y-axis in the direction of gravity. Since the
work done by gravity on the particle less work done by the resisting force
R(v) is equal to the change in kinetic energy, we have

$d\left(\frac{v^2}{2}\right) = g\,dy - R(v)\,ds.$

If we take x as our independent variable, this relation gives the
constraining condition in the form

(88.19) $\phi(y, v, y', v') \equiv vv' - gy' + R(v)\sqrt{1 + (y')^2} = 0,$

where primes denote derivatives with respect to x.
The integral to be minimized under the condition 88.19 is

(88.20) $J = \int_{t_1}^{t_2} dt = \int \frac{ds}{v} = \int_{x_1}^{x_2} \frac{\sqrt{1 + (y')^2}}{v}\,dx.$

We denote the integrand in (88.20) by $F = \sqrt{1 + (y')^2}/v$, and construct
the function

$G = F + \lambda\phi = \frac{\sqrt{1 + (y')^2}}{v} + \lambda(x)\left[vv' - gy' + R(v)\sqrt{1 + (y')^2}\right].$

If we define

(88.21) $H(v, \lambda) = \frac{1}{v} + \lambda R(v),$

we can write G as

(88.22) $G = H\sqrt{1 + (y')^2} + \lambda(vv' - gy').$


The equations of C are determined from Euler’s equations

(88.23) $\frac{dG_{y'}}{dx} - G_y = 0, \quad \frac{dG_{v'}}{dx} - G_v = 0,$



and, since G does not contain y, we conclude from the first of equations
88.23 that $G_{y'} = a$ or

(88.24) $\frac{Hy'}{\sqrt{1 + (y')^2}} - \lambda g = a,$

where a is a constant.
The second of equations 88.23 yields

$\frac{d(\lambda v)}{dx} = H_v\sqrt{1 + (y')^2} + \lambda v'$

or

(88.25) $\frac{v\lambda'(x)}{\sqrt{1 + (y')^2}} = H_v.$

We thus have the system of three equations 88.19, 88.24, 88.25 for the
determination of y(x), v(x), and $\lambda(x)$. We can rewrite them as

(88.26) $v\frac{dv}{ds} = g\frac{dy}{ds} - R, \quad H\frac{dy}{ds} = \lambda g + a, \quad v\frac{d\lambda}{ds} = H_v,$

by setting $ds = \sqrt{1 + (y')^2}\,dx$. On eliminating dy and ds from (88.26),
we get the equation

$H(H_v\,dv + R\,d\lambda) = (g\lambda + a)g\,d\lambda,$

and since $R = H_\lambda$ by (88.21), we can write it as

$H(H_v\,dv + H_\lambda\,d\lambda) = (g\lambda + a)g\,d\lambda$

or

$H\,dH = (g\lambda + a)g\,d\lambda.$

The integration of this equation yields

(88.27) $H^2 = (g\lambda + a)^2 + b^2,$

where $b^2$ is the constant of integration.

It follows from (88.21) that (88.27) is a quadratic equation in $\lambda$ so that
$\lambda$ can be regarded as a known function of v and the integration
constants a, b. This suggests that the equation of C be sought in the
parametric form

(88.28) $C:\ x = x(v), \quad y = y(v).$




Since $dy/dv = dy/ds \cdot ds/dv$, we find with the aid of the first two
equations in (88.26) that

(88.29) $\frac{dy}{dv} = \frac{v(\lambda g + a)}{g(\lambda g + a) - RH},$

the right-hand member of which is a known function of v. On performing
quadrature we then get

(88.30) $y = f_1(v, a, b) + c,$

where c is a constant. Equation 88.30 is one of the desired equations in
(88.28). To obtain $x = x(v)$, we note that $dx/dv = dx/ds \cdot ds/dv$ and,
since $dx/ds = \sqrt{1 - (dy/ds)^2}$ and both dy/ds and ds/dv are determined
by the first two equations in (88.26), we see that dx/ds is also a known
function of v. The reader will check that

$\frac{dx}{dv} = \frac{bv}{g(\lambda g + a) - RH},$

so that

(88.31) $x = f_2(v, a, b) + d,$

where d is a constant. The constants of integration in (88.30) and (88.31)
must be determined from the initial conditions. To make the problem
physically meaningful we must impose some restrictions on the relative
magnitudes of R(v) and the gravitational force g, as, for example, R < g
for all relevant values of v.
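As a consistency check on the parametric solution (an added remark, not taken from the original text), one may set $R(v) \equiv 0$. The sketch below assumes the positive root for $g\lambda + a$ in (88.27) and introduces the auxiliary parameter $\psi$ through $bv = \sin\psi$; under those assumptions the formulas reduce to the cycloid of the classical brachistochrone.

```latex
% Special case R(v) = 0, so that H = 1/v by (88.21).
\frac{dy}{dv} = \frac{v}{g}
  \quad\Longrightarrow\quad y = \frac{v^2}{2g} + c ,
\qquad
g\lambda + a = \sqrt{\frac{1}{v^2} - b^2}
  \quad\text{by (88.27)},
\\[4pt]
\frac{dx}{dv} = \frac{b v}{g\,(g\lambda + a)}
             = \frac{b v^2}{g\sqrt{1 - b^2 v^2}} .
\\[4pt]
% Substituting bv = \sin\psi and integrating:
x = \frac{1}{4 g b^2}\,(2\psi - \sin 2\psi) + d , \qquad
y = \frac{1}{4 g b^2}\,(1 - \cos 2\psi) + c .
```

The first relation is the energy integral $v^2 = 2g(y - c)$, and the last pair describes a cycloid generated by a circle of radius $1/(4gb^2)$.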


Problems 


1. A hollow cylindrical drum of mass m rolls under gravity down a rough
inclined plane making an angle $\phi$ with the horizontal. What must the
coefficient of friction $\mu$ be to prevent slipping? (Answer: $\mu > \tfrac12\tan\phi$.)

2. A bead of mass m slides on a smooth rod rotating in a vertical plane about
one end with constant angular velocity $\omega$. Show that the equation of motion is
$\ddot r - \omega^2 r = g\sin\omega t$, and solve it.

3. A bead slides on a smooth circular wire of radius a, which is rotating with
constant angular velocity $\omega$ about the vertical diameter of the wire. Show that
$\ddot\theta - \omega^2\sin\theta\cos\theta = (g/a)\sin\theta$, where $\theta$ is the angle made by the
radius to the particle with the diameter.


89. Illustrative Examples


We give next three examples illustrating the use of generalized co- 
ordinates. 

Consider first the problem of a simple pendulum, consisting of a bob of
mass m supported by a light inextensible cord of length l. We shall
suppose that the pendulum is set in vibration in some plane which we take
as the $Y^1Y^2$-plane. (See Fig. 38.)

Fig. 38

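In order to form Lagrangean equations

(89.1) $\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q^i}\right) - \frac{\partial T}{\partial q^i} = Q_i,$

we need the expression for the kinetic energy

(89.2) $T = \tfrac12 m\,\dot y^i\dot y^i.$

However,

(89.3) $y^1 = l\sin\theta = l\sin\frac{q}{l}, \quad y^2 = l(1 - \cos\theta) = l\left(1 - \cos\frac{q}{l}\right),$

where we take the arc-length $q = l\theta$ as our generalized coordinate. Since
$\dot y^2 = \dot q\sin(q/l)$ and $\dot y^1 = \dot q\cos(q/l)$, equation 89.2 becomes
$T = \tfrac12 m(\dot q)^2$.
The work $W_\delta$ done in producing a virtual displacement $\delta q$ is

$W_\delta = -mg\sin\theta\,\delta q = -mg\sin\frac{q}{l}\,\delta q,$

and hence the generalized force $Q = -mg\sin(q/l)$. Thus equation 89.1
yields

(89.4) $\ddot q + g\sin\frac{q}{l} = 0,$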


and, since for small displacements $\sin\theta \approx \theta$, for small vibrations
we have

$\ddot q + k^2 q = 0,$

where $k^2 = g/l$. The solution of this equation is $q = a\cos(kt + \alpha)$. The
solution of (89.4) can be expressed in terms of elliptic integrals of the
first kind.

Fig. 39
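The small-vibration approximation for the simple pendulum can be compared with a direct numerical integration of (89.4). The sketch below is an illustration only: it uses a classical fourth-order Runge-Kutta step rather than elliptic integrals, and the function names, step size, and sample amplitude are assumptions.

```python
import math

def simulate_pendulum(q0, length, g=9.81, dt=1e-3, t_end=2.0):
    """Integrate q'' = -g*sin(q/l), starting at rest from arc length q0."""
    def accel(q):
        return -g * math.sin(q / length)

    q, v = q0, 0.0
    for _ in range(int(round(t_end / dt))):
        # classical RK4 for the first-order system (q, v)
        k1q, k1v = v, accel(q)
        k2q, k2v = v + 0.5 * dt * k1v, accel(q + 0.5 * dt * k1q)
        k3q, k3v = v + 0.5 * dt * k2v, accel(q + 0.5 * dt * k2q)
        k4q, k4v = v + dt * k3v, accel(q + dt * k3q)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return q, v

def small_angle(q0, length, t, g=9.81):
    """Linearized solution q = q0*cos(k*t) with k^2 = g/l (start at rest)."""
    return q0 * math.cos(math.sqrt(g / length) * t)

q0, length, t_end = 0.01, 1.0, 2.0
q_num, v_num = simulate_pendulum(q0, length, t_end=t_end)
q_lin = small_angle(q0, length, t_end)
```

For an amplitude this small the two values should be nearly indistinguishable; increasing `q0` makes the difference between the linear and the nonlinear motion visible.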

We turn next to a more interesting problem of a double pendulum.
Consider an arrangement of particles shown in Fig. 39, where we suppose
that the masses $m_1$ and $m_2$ are supported by inextensible light cords of
lengths $l_1$ and $l_2$, respectively. The pendulum is assumed to vibrate in
one plane, and we take as our generalized coordinates the quantities
$\theta$ and $\phi$, which give the angular deviations of the cords of lengths
$l_1$ and $l_2$ from the vertical.

The equations connecting the coordinates $(y_1^1, y_1^2)$ and $(y_2^1, y_2^2)$ of
the masses $m_1$ and $m_2$ with generalized coordinates $q^1 = \theta$ and
$q^2 = \phi$ are

$y_1^1 = l_1\sin q^1,$
$y_1^2 = l_1\cos q^1,$
$y_2^1 = l_1\sin q^1 + l_2\sin q^2,$
$y_2^2 = l_1\cos q^1 + l_2\cos q^2.$

Since

$T = \tfrac12 m_1\,\dot y_1^i\dot y_1^i + \tfrac12 m_2\,\dot y_2^i\dot y_2^i, \quad (i = 1, 2),$

an easy calculation gives

$T = \tfrac12\{m_1(l_1\dot q^1)^2 + m_2[(l_1\dot q^1)^2 + 2l_1l_2\dot q^1\dot q^2\cos(q^2 - q^1) + (l_2\dot q^2)^2]\}.$


Now, the work done in a small virtual displacement $\delta q^2$ when
$\delta q^1 = 0$ is

$W_\delta = -m_2gl_2\sin q^2\,\delta q^2,$

so that

$Q_2 = -m_2gl_2\sin q^2.$

Also the work done in a displacement $\delta q^1$ when $\delta q^2 = 0$ is

$W_\delta = -(m_1 + m_2)gl_1\sin q^1\,\delta q^1.$

Thus

$Q_1 = -(m_1 + m_2)gl_1\sin q^1.$

Making use of equations 89.1, we find a pair of simultaneous ordinary
differential equations

(89.5)
$\frac{d}{dt}\{(m_1 + m_2)l_1^2\dot q^1 + m_2l_1l_2\dot q^2\cos(q^2 - q^1)\} - m_2l_1l_2\dot q^1\dot q^2\sin(q^2 - q^1) = -(m_1 + m_2)gl_1\sin q^1,$
$\frac{d}{dt}\{m_2l_1l_2\dot q^1\cos(q^2 - q^1) + m_2(l_2)^2\dot q^2\} + m_2l_1l_2\dot q^1\dot q^2\sin(q^2 - q^1) = -m_2gl_2\sin q^2,$

for the determination of the dynamical trajectory.

Instead of determining the generalized forces $Q_1$ and $Q_2$ directly, we
could have made use of the potential energy V, which is

$V = m_1gl_1(1 - \cos q^1) + m_2g(l_1 + l_2 - l_1\cos q^1 - l_2\cos q^2),$

if we assume V = 0 when $q^1 = q^2 = 0$.

For a detailed discussion of the solution of the system of differential
equations 89.5 we refer to standard treatises on analytical dynamics.
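That the two routes to the generalized forces of the double pendulum agree can be confirmed numerically. The following sketch is illustrative: the function names, the finite-difference step `h`, and the sample masses, lengths and angles are assumptions. It differentiates the potential energy V by central differences and compares the result with the forces found by virtual work.

```python
import math

def potential(q1, q2, m1, m2, l1, l2, g=9.81):
    """Potential energy V of the double pendulum, zero at q1 = q2 = 0."""
    return (m1 * g * l1 * (1.0 - math.cos(q1))
            + m2 * g * (l1 + l2 - l1 * math.cos(q1) - l2 * math.cos(q2)))

def forces_from_potential(q1, q2, m1, m2, l1, l2, g=9.81, h=1e-6):
    """Return (Q1, Q2) = (-dV/dq1, -dV/dq2) by central differences."""
    dv1 = (potential(q1 + h, q2, m1, m2, l1, l2, g)
           - potential(q1 - h, q2, m1, m2, l1, l2, g)) / (2.0 * h)
    dv2 = (potential(q1, q2 + h, m1, m2, l1, l2, g)
           - potential(q1, q2 - h, m1, m2, l1, l2, g)) / (2.0 * h)
    return -dv1, -dv2

def forces_by_virtual_work(q1, q2, m1, m2, l1, l2, g=9.81):
    """Closed forms of Q1 and Q2 obtained from the virtual work."""
    return (-(m1 + m2) * g * l1 * math.sin(q1),
            -m2 * g * l2 * math.sin(q2))

args = (0.4, -0.9, 1.0, 2.0, 1.0, 0.5)   # q1, q2, m1, m2, l1, l2 (assumed)
numeric = forces_from_potential(*args)
closed = forces_by_virtual_work(*args)
```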

As our final example we consider the problem of small oscillations of a
conservative dynamical system about the position of stable equilibrium.

We suppose that the system is natural, holonomic, with n degrees of
freedom, and select the generalized coordinates $q^i$ so that the
equilibrium position is given by $q^i = 0$, (i = 1, ..., n). Since the
equilibrium is stable, the potential energy $V(q^1, \ldots, q^n)$ has a
minimum value at $q^i = 0$, and hence
$\left.\frac{\partial V}{\partial q^i}\right|_0 = 0$. If we choose the potential
level to be zero at $q^i = 0$, then the expansion of $V(q^1, \ldots, q^n)$ in
Taylor’s series about $q^i = 0$ has the form
$V = \tfrac12 b_{ij}q^iq^j + O(q^3)$, where $O(q^3)$ denotes the remainder after
the second-degree terms in the $q^i$. Since we are concerned with small
oscillations about the point $q^i = 0$, we shall suppose that the potential
energy is represented with sufficient accuracy by the quadratic form

(89.6) $V = \tfrac12 b_{ij}q^iq^j, \quad (b_{ij} = b_{ji}).$


The kinetic energy T of the system is

(89.7) $T = \tfrac12 a_{ij}\dot q^i\dot q^j, \quad (a_{ij} = a_{ji}),$

and we suppose that, in the neighborhood of the point $q^i = 0$, the
$a_{ij}$’s do not vary appreciably, so that they can be regarded as constants.

The Lagrangean equations 86.12 now yield the system of n simultaneous
second-order ordinary differential equations with constant coefficients

$a_{ij}\ddot q^j + b_{ij}q^j = 0.$

Instead of integrating this coupled system directly we can simplify the
problem by introducing a new set of independent variables ${q'}^i$, the
so-called normal coordinates, which are related linearly to the
coordinates $q^i$ in such a way that the quadratic forms 89.6 and 89.7
reduce simultaneously¹⁶ to a sum of squares. We then have

(89.8) $T = \tfrac12[({\dot q'}^1)^2 + \cdots + ({\dot q'}^n)^2], \quad V = \tfrac12[\lambda_1^2({q'}^1)^2 + \cdots + \lambda_n^2({q'}^n)^2].$

All the coefficients of the ${q'}^i$’s in (89.8) are nonnegative since the
quadratic form 89.6 is necessarily nonnegative if the potential energy V
has a minimum at $q^i = 0$.
The Lagrangean equations now become

${\ddot q'}^i + \lambda_i^2\,{q'}^i = 0,$ (no sum on i),

and their solutions obviously are

${q'}^i = c_i\cos(\lambda_i t + \epsilon_i), \quad (i = 1, \ldots, n).$

Thus the oscillation of the system, in terms of the normal coordinates,
is simple harmonic with normal modes of vibration determined by the
characteristic values $\lambda_i$ which satisfy the frequency equation

(89.9) $|b_{ij} - \lambda^2a_{ij}| = 0.$

If the roots $\lambda_i$ are distinct, the normal coordinates ${q'}^i$ are
determined essentially uniquely. For multiple roots, the choice of normal
coordinates is not unique. This follows from the analysis given in Sec. 16.

The problems of small oscillations are of great technical interest, and
there is an extensive literature concerned with the study of oscillating
systems with finite and infinite number of degrees of freedom.¹⁷


16 This algebraic problem was considered in detail in Sec. 16.
17 See, for some interesting examples, Frazer, Duncan, and Collar, Elementary
Matrices and Some Applications to Dynamics and Differential Equations, Cambridge
University Press, 1938.




As a concrete illustration of our general discussion of oscillation of
dynamical systems about the position of stable equilibrium consider the
double pendulum in Fig. 39 with $l_1 = l_2 = l$ and $m_1 = m_2 = m$. The
expressions for T and V given earlier in this section in this case reduce to

$T = \tfrac12 l^2m[2(\dot q^1)^2 + 2\dot q^1\dot q^2\cos(q^2 - q^1) + (\dot q^2)^2],$
$V = mgl[(1 - \cos q^1) + (2 - \cos q^1 - \cos q^2)].$

If we expand T and V in powers of $q^i$ and $\dot q^i$ and retain only the
second-degree terms in these variables, we get

(89.10) $T = \frac{ml^2}{2}[2(\dot q^1)^2 + 2\dot q^1\dot q^2 + (\dot q^2)^2], \quad V = \frac{mgl}{2}[2(q^1)^2 + (q^2)^2].$


To reduce (89.10) to the form 89.8 we introduce the normal coordinates
$x = {q'}^1$, $y = {q'}^2$ by a linear transformation (cf. Sec. 16)

(89.11) $q^1 = a_1x + a_2y, \quad q^2 = b_1x + b_2y.$

The coefficients $a_i$ and $b_i$ in (89.11) must be chosen so that T and V in
(89.10) reduce to

(89.12) $T = \tfrac12(\dot x^2 + \dot y^2), \quad V = \tfrac12(\lambda_1^2x^2 + \lambda_2^2y^2).$

The substitution from (89.11) in (89.10) yields two quadratic forms in
which the cross-product terms must vanish. Thus

$2b_1a_2 + 2b_2a_1 = 0,$
$4a_1a_2 + 2b_1b_2 = 0.$

Solving these, we get

$\frac{b_1}{a_1} = -\frac{b_2}{a_2} = \sqrt2.$

Furthermore, the comparison of coefficients of $\dot x^2$ and $\dot y^2$ shows that

$a_1^2 = \frac{2 - \sqrt2}{4ml^2}, \quad a_2^2 = \frac{2 + \sqrt2}{4ml^2}.$

Thus the desired transformation 89.11 is

(89.13)
$q^1 = \frac{1}{2l\sqrt m}\left(\sqrt{2 - \sqrt2}\;x + \sqrt{2 + \sqrt2}\;y\right),$
$q^2 = \frac{\sqrt2}{2l\sqrt m}\left(\sqrt{2 - \sqrt2}\;x - \sqrt{2 + \sqrt2}\;y\right),$




under which V assumes the form

$V = \frac{g}{2l}[(2 - \sqrt2)x^2 + (2 + \sqrt2)y^2].$

Accordingly, the Lagrangean equations in normal coordinates are

$\ddot x + \frac{g}{l}(2 - \sqrt2)x = 0, \quad \ddot y + \frac{g}{l}(2 + \sqrt2)y = 0.$

Solving these, we get

(89.14) $x = c_1\cos(\lambda_1t + c_2), \quad y = c_3\cos(\lambda_2t + c_4),$

where

$\lambda_1^2 = \frac{g}{l}(2 - \sqrt2), \quad \lambda_2^2 = \frac{g}{l}(2 + \sqrt2).$

The independent oscillations in (89.14) have periods $T_1 = 2\pi/\lambda_1$ and
$T_2 = 2\pi/\lambda_2$. The vibration with the larger period is that of x; it is
called the grave mode. The rapid mode is that of y. If we set y = 0 in
(89.13) and consider the grave mode, we see that

$q^2 = \sqrt2\,q^1.$


The performance of the pendulum in this case is illustrated in Fig. 40a.
On setting x = 0, we get the motion of the rapid mode for which
$q^2 = -\sqrt2\,q^1$. This is shown in Fig. 40b. The angles shown in these
figures are exaggerated. The general motion given by (89.13) is a
combination of motions of the two characteristic modes.

Fig. 40. (a) Grave mode, $q^2 = \sqrt2\,q^1$; (b) Rapid mode, $q^2 = -\sqrt2\,q^1$.


One can, of course, get the normal frequencies $\lambda_1$, $\lambda_2$ directly from
the frequency equation (89.9).
If we substitute T and V from (89.10) in the Lagrangean equations
86.12, we get a pair of equations

(89.15) $2\ddot q^1 + \ddot q^2 + \frac{2g}{l}q^1 = 0, \quad \ddot q^1 + \ddot q^2 + \frac{g}{l}q^2 = 0,$

in which the variables $q^1$ and $q^2$ are coupled. We assume solutions of
(89.15) in the form

(89.16) $q^1 = a_1e^{i\lambda t}, \quad q^2 = a_2e^{i\lambda t},$

and determine $\lambda$ so that equations 89.15 are satisfied. On substituting
(89.16) in (89.15) we get two homogeneous equations

$a_1\left(\frac{2g}{l} - 2\lambda^2\right) + a_2(-\lambda^2) = 0,$
$a_1(-\lambda^2) + a_2\left(\frac{g}{l} - \lambda^2\right) = 0,$

which will have nontrivial solutions for $a_1$ and $a_2$ if, and only if,

$\begin{vmatrix} 2g/l - 2\lambda^2 & -\lambda^2 \\ -\lambda^2 & g/l - \lambda^2 \end{vmatrix} = 0.$

On expanding this determinant, we find that

$\lambda^2 = \frac{g}{l}(2 \pm \sqrt2),$

which yields two values $\lambda_1^2 = (g/l)(2 - \sqrt2)$, $\lambda_2^2 = (g/l)(2 + \sqrt2)$
corresponding to the grave and rapid modes found previously. Thus the
solution (89.16) can be written

$q^1 = c_1e^{i\lambda_1t} + c_2e^{i\lambda_2t},$
$q^2 = \sqrt2\,c_1e^{i\lambda_1t} - \sqrt2\,c_2e^{i\lambda_2t},$

as in (89.13).
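For two degrees of freedom the frequency equation (89.9) is a quadratic in $\lambda^2$ and can be solved directly. The sketch below is illustrative only: the function names and the sample values m = l = 1, g = 9.81 are assumptions. It expands the determinant for symmetric 2 x 2 matrices and evaluates the amplitude ratio $a_2/a_1$ of each mode for the double pendulum of (89.10).

```python
import math

def normal_frequencies_squared(a, b):
    """Roots s = lambda^2 of det(b - s*a) = 0 for symmetric 2x2 a and b."""
    # det(b - s a) = A s^2 + B s + C
    A = a[0][0] * a[1][1] - a[0][1] ** 2
    B = -(a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2.0 * a[0][1] * b[0][1])
    C = b[0][0] * b[1][1] - b[0][1] ** 2
    root = math.sqrt(B * B - 4.0 * A * C)
    return (-B - root) / (2.0 * A), (-B + root) / (2.0 * A)

def amplitude_ratio(a, b, s):
    """a_2/a_1 from (b11 - s*a11)*a_1 + (b12 - s*a12)*a_2 = 0."""
    return -(b[0][0] - s * a[0][0]) / (b[0][1] - s * a[0][1])

# Coefficient matrices read off from (89.10); m, l, g are sample values.
m, l, g = 1.0, 1.0, 9.81
a = [[2 * m * l * l, m * l * l], [m * l * l, m * l * l]]
b = [[2 * m * g * l, 0.0], [0.0, m * g * l]]
s1, s2 = normal_frequencies_squared(a, b)
ratio_grave = amplitude_ratio(a, b, s1)
ratio_rapid = amplitude_ratio(a, b, s2)
```

The smaller root belongs to the grave mode and the larger to the rapid mode; the two ratios should come out as $\sqrt2$ and $-\sqrt2$.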
Problems 


1. Find the normal modes of vibration for the double pendulum in Fig. 39,
assuming that $l_1 = l_2$, but $m_1 \neq m_2$.

2. A particle of mass m oscillates about the lowest point of a smooth surface
$z = \tfrac12(ax^2 + 2hxy + by^2)$, where the coordinates are orthogonal cartesian and
the z-axis is directed vertically up. We suppose that the vertical component of
the velocity is small, so that $T = \tfrac12 m(\dot x^2 + \dot y^2)$. The potential
$V = mgz = (mg/2)(ax^2 + 2hxy + by^2)$. Obtain equations of motion, determine their
solutions in the form $x = a_1e^{i\lambda t}$, $y = a_2e^{i\lambda t}$, and conclude that if
V = min at x = 0, y = 0, then a > 0, b > 0, $ab - h^2 > 0$.

3. Let the particle in the problem at the end of Sec. 80 be acted on by the
force of gravity, so that $F_1 = mga\sin u^1$, $F_2 = 0$. (Note that the work $\delta W$ done
in a small displacement $\delta y^3$ is $\delta W = -mg\,\delta y^3 = mga\sin u^1\,\delta u^1$.) Show that the
motion, when the particle passes through the highest and lowest points on the
sphere, is along an arc of a great circle. A complete discussion of this problem
is involved. See P. Appell, Mécanique rationnelle, 1, Chapter 13, especially Sec.
277. See also a discussion of the spherical pendulum in J. L. Synge and B. A.
Griffith, Principles of Mechanics.

4. Let the particle in the preceding problem execute small oscillations about 
the lower pole of the sphere. Consider projection of this motion on the plane 
tangent to the pole and discuss the motion. 

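Hint: Set $u^1 = \pi - (r/a)$, and deduce equations

$\ddot r - r(\dot u^2)^2 = -\frac{g}{a}\,r,$
$r\ddot u^2 + 2\dot r\dot u^2 = 0,$

in which $u^2$ denotes the second surface coordinate, not a square.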


90. Hamilton’s Canonical Equations


Consider a conservative holonomic dynamical system with n degrees of
freedom and the integral

(90.1) $J = \int_{t_1}^{t_2} L(q, \dot q)\,dt,$

where $L = T - V$ is the kinetic potential. We saw in Sec. 86 that the
system of Euler’s equations associated with the variational problem
J = extremum consists of a set of n simultaneous second-order ordinary
differential equations 86.12, which we write in the form

(90.2) $\frac{dL_{\dot q^i}}{dt} = L_{q^i},$

by using the subscript notation for partial derivatives of $L(q, \dot q)$. In
a variety of considerations it is convenient to rewrite the system of n
Lagrangean equations 90.2 in the form of an equivalent set of 2n
first-order equations, known as Hamilton’s equations.

The function $L(q, \dot q) = T(q, \dot q) - V(q)$ depends on n generalized
coordinates $q^i$ and n generalized velocities $\dot q^i$. Instead of the
variables $\dot q^i$ we can introduce a set of n new variables $p_i$ defined by
the relations

(90.3) $p_i = L_{\dot q^i}(q, \dot q), \quad (i = 1, \ldots, n),$




where we suppose that the system 90.3 is solvable for the $\dot q^i$ in terms
of the $p_i$ and $q^i$. This, surely, will be the case if the Jacobian
determinant $\left|\frac{\partial^2L}{\partial\dot q^i\,\partial\dot q^j}\right| \neq 0$. We next
construct a function H(p, q) of the independent variables q and p,

(90.4) $H(p, q) = \dot q^ip_i - L(q, \dot q),$

by expressing the $\dot q^i = \dot q^i(q, p)$ in the right-hand member of (90.4)
in terms of the $q^i$ and $p_i$ with the aid of (90.3).

On differentiating (90.4) with respect to $q^i$, we get

$H_{q^i} = \frac{\partial\dot q^j}{\partial q^i}\,p_j - L_{q^i} - L_{\dot q^j}\frac{\partial\dot q^j}{\partial q^i},$

and since $p_j = L_{\dot q^j}$ by (90.3),

(90.5) $H_{q^i} = -L_{q^i}.$

Similarly, we compute

$H_{p_i} = \dot q^i + \frac{\partial\dot q^j}{\partial p_i}\,p_j - L_{\dot q^j}\frac{\partial\dot q^j}{\partial p_i},$

which on using (90.3) reduces to

(90.6) $H_{p_i} = \dot q^i.$

But the Lagrangean equations 90.2 state that

$\frac{dL_{\dot q^i}}{dt} = L_{q^i},$

and, if we recall the definition 90.3 and formula 90.5, we obtain a set of n
first-order equations,

(90.7) $\frac{dp_i}{dt} = -H_{q^i}, \quad (i = 1, \ldots, n),$

which together with the n equations 90.6,

[90.6] $\frac{dq^i}{dt} = H_{p_i}, \quad (i = 1, \ldots, n),$

constitute the system of 2n first-order Hamilton’s canonical equations.
The function H(p, q), known as the Hamiltonian function, has an
important physical meaning. Since $L = T - V$ and V is a function of the
$q^i$ alone, we can rewrite (90.4) as

(90.8) $H = \dot q^i\frac{\partial L}{\partial\dot q^i} - L = \dot q^i\frac{\partial T}{\partial\dot q^i} - T + V.$

However, $T = \tfrac12 a_{ij}\dot q^i\dot q^j$, $\partial T/\partial\dot q^i = a_{ij}\dot q^j$,


so that

$\dot q^i\frac{\partial T}{\partial\dot q^i} = 2T,$

and hence (90.8) reduces to

$H = T + V.$

Thus H is the total energy of the system.
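The identity H = T + V for a kinetic energy that is quadratic in the velocities is easy to confirm with numbers. The sketch below is an illustration, not part of the original text; the function name and the sample coefficients $a_{ij}$, velocities, and potential value are assumptions.

```python
def hamiltonian_from_lagrangian(a, qdot, V):
    """Evaluate H = qdot^i p_i - L for T = (1/2) a_ij qdot^i qdot^j.

    a is a symmetric matrix given as nested lists, qdot the list of
    generalized velocities, and V the value of the potential energy.
    """
    n = len(qdot)
    # generalized momenta p_i = a_ij qdot^j
    p = [sum(a[i][j] * qdot[j] for j in range(n)) for i in range(n)]
    T = 0.5 * sum(qdot[i] * p[i] for i in range(n))
    L = T - V
    H = sum(qdot[i] * p[i] for i in range(n)) - L
    return H, T, p

# Assumed sample data for two degrees of freedom.
H, T, p = hamiltonian_from_lagrangian([[2.0, 1.0], [1.0, 1.0]], [0.3, -0.2], 1.7)
```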
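The variables

(90.9) $p_i = \frac{\partial T}{\partial\dot q^i} = a_{ij}\dot q^j$

are called the generalized momenta, and we note that the square of the
magnitude of the vector $p_i$ is

(90.10) $p^2 = a^{ij}p_ip_j = a^{ij}a_{i\alpha}a_{j\beta}\,\dot q^\alpha\dot q^\beta = a_{\alpha\beta}\,\dot q^\alpha\dot q^\beta = 2T.$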
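As an illustration of a simple use of Hamilton’s equations, consider a
particle of mass m moving under the influence of a central force field with
the potential V(r), r being the distance of the particle from the center of
attraction. If we choose polar coordinates $r = q^1$, $\theta = q^2$ as our
generalized coordinates, then

$T = \frac{m}{2}[\dot r^2 + (r\dot\theta)^2] = \tfrac12 a_{ij}\dot q^i\dot q^j,$

where

$(a_{ij}) = \begin{pmatrix} m & 0 \\ 0 & mr^2 \end{pmatrix}.$

But $H = T + V = \tfrac12 a^{ij}p_ip_j + V$, by (90.10), which yields on inserting
the values of the $a^{ij}$,

$H = \frac{p_1^2}{2m} + \frac{p_2^2}{2mr^2} + V(r).$

Thus

$\frac{\partial H}{\partial r} = -\frac{p_2^2}{mr^3} + V'(r), \quad \frac{\partial H}{\partial\theta} = 0, \quad \frac{\partial H}{\partial p_1} = \frac{p_1}{m}, \quad \frac{\partial H}{\partial p_2} = \frac{p_2}{mr^2},$

and hence Hamilton’s equations (90.6), (90.7) in this problem are

(90.11) $\frac{dr}{dt} = \frac{p_1}{m}, \quad \frac{d\theta}{dt} = \frac{p_2}{mr^2}, \quad \frac{dp_1}{dt} = \frac{p_2^2}{mr^3} - V'(r), \quad \frac{dp_2}{dt} = 0.$

The last of these equations, combined with the second, yields

$\frac{d}{dt}(mr^2\dot\theta) = 0,$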




which is a statement of Kepler’s second law of planetary motion. It is
not difficult to show by using the remaining equations in (90.11) that if
$V = -m/r$, the orbit is a conic section (cf. Sec. 97).
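Hamilton’s equations for the central-force problem lend themselves to direct numerical integration. The following sketch is illustrative: it assumes the potential $V(r) = -k/r$ with a hypothetical constant k, uses a classical fourth-order Runge-Kutta step, and all function names, the step size, and the initial data are assumptions. It tracks the state $(r, \theta, p_1, p_2)$ and lets one verify that $p_2 = mr^2\dot\theta$ stays constant while H is conserved to integration accuracy.

```python
import math

def kepler_rhs(state, m, k):
    """Right-hand sides of (90.11) for the assumed potential V(r) = -k/r."""
    r, theta, p1, p2 = state
    return (p1 / m,
            p2 / (m * r * r),
            p2 * p2 / (m * r ** 3) - k / (r * r),   # -dH/dr, V'(r) = k/r^2
            0.0)

def rk4_step(state, dt, m, k):
    """One classical Runge-Kutta step for the four canonical variables."""
    def shifted(s, d, h):
        return tuple(x + h * y for x, y in zip(s, d))
    k1 = kepler_rhs(state, m, k)
    k2 = kepler_rhs(shifted(state, k1, 0.5 * dt), m, k)
    k3 = kepler_rhs(shifted(state, k2, 0.5 * dt), m, k)
    k4 = kepler_rhs(shifted(state, k3, dt), m, k)
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, m, k):
    """Hamiltonian H = p1^2/2m + p2^2/(2 m r^2) - k/r."""
    r, _, p1, p2 = state
    return p1 * p1 / (2 * m) + p2 * p2 / (2 * m * r * r) - k / r

m, k, dt = 1.0, 1.0, 1e-3
state = (1.0, 0.0, 0.0, 1.1)      # assumed start: r, theta, p1, p2
h0 = energy(state, m, k)
for _ in range(5000):
    state = rk4_step(state, dt, m, k)
```

Because $dp_2/dt = 0$, the areal velocity $r^2\dot\theta/2 = p_2/2m$ is the same at every step, which is Kepler’s second law in this setting.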




Problems 


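1. If a particle of mass m is constrained to move on a smooth surface, show
that the system of Hamilton’s equations is

$\frac{du^\alpha}{dt} = \frac{\partial H}{\partial p_\alpha}, \quad \frac{dp_\alpha}{dt} = -\frac{\partial H}{\partial u^\alpha},$

with $p_\alpha = ma_{\alpha\beta}\dot u^\beta$ and $H = (1/2m)\,a^{\alpha\beta}p_\alpha p_\beta + V$.
2. Show that along the dynamical trajectory dH/dt = 0, so that H = constant
is an integral of Hamilton’s equations.
3. Show that $\partial L/\partial q^i + \partial H/\partial q^i = 0$.
4. Write Hamilton’s canonical equations for Problem 1, Sec. 89.
5. If $T = \tfrac12 m(\dot q)^2$ and $V = \tfrac12 k(q)^2$, k > 0, show that
$H = p^2/2m + m\omega^2(q)^2/2$, where $\omega^2 = k/m$. Deduce that
$q = \sqrt{2h/m\omega^2}\,\sin(\omega t + \alpha)$.
6. Deduce Hamilton’s equations from the variational principle $\delta\int L\,dt = 0$.
Hint: Write L in the form $L = p_i(dq^i/dt) - H(p, q)$, treat the variations of p
and q as independent, and show that

$\int\left[\left(\frac{dq^i}{dt} - \frac{\partial H}{\partial p_i}\right)\delta p_i - \left(\frac{dp_i}{dt} + \frac{\partial H}{\partial q^i}\right)\delta q^i\right]dt = 0.$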


91. Newtonian Law of Gravitation 


The general formulation of dynamical equations, outlined in the
preceding sections, imposes no specific restrictions on the functional
form of the fields of force. In various applications of dynamics,
including those of astronomy and atomic physics, we are concerned with the
behavior of dynamical systems subjected to the action of central fields of
force and, in particular, those fields whose intensity varies inversely as
the square of the distance of the particles from the center of attraction.
The inverse square law of attraction had its origin in Newton’s studies of
motion of planetary bodies in what he termed¹⁸ the “eccentric conic
sections.” We state this law as follows:

Two material particles attract each other with a force which is directly
proportional to the product of their masses and inversely proportional to
the square of the distance between them. The line of action of the force is
along the line joining the particles.

Thus the law, when stated in the form of a vector equation, reads

$\mathbf F = \gamma\,\frac{m_1m_2}{r_{12}^3}\,\mathbf r_{12},$


18 Newton’s Principia, Book I, Sec. III, Propositions 1-17. 




where $m_1$ and $m_2$ are the masses of the particles and $\mathbf r_{12}$ is the
vector from $P_1$ to $P_2$. The constant of proportionality $\gamma$ depends on
the choice of units; in the cgs system its value is found to be
$6.664 \times 10^{-8}$, and its physical dimensions are $M^{-1}L^3T^{-2}$. In our
work we shall make $\gamma = 1$, by a suitable choice of units of measure, so
that

(91.1) $\mathbf F = \frac{m_1m_2}{r_{12}^3}\,\mathbf r_{12}.$

We observe first that the law of gravitation 91.1 refers to two particles,
and, since in dynamics one usually deals with continuous distributions of
matter, it is necessary to generalize it. Thus one can subdivide the bodies
into small parts, replace each part by an equivalent material particle, add
the forces corresponding to discrete particles, and pass to the limit as the
number of subdivisions is increased indefinitely. This procedure for two
bodies $\tau_1$ and $\tau_2$ leads to the formula

(91.2) $\mathbf F = \int_{\tau_1}\int_{\tau_2}\frac{\rho_1\rho_2}{r_{12}^3}\,\mathbf r_{12}\,d\tau_1\,d\tau_2,$

where $d\tau_1$ and $d\tau_2$ are the volume elements of bodies $\tau_1$ and
$\tau_2$, $\rho_1$ and $\rho_2$ their density functions, and $\mathbf r_{12}$ is the
position vector of $d\tau_2$ relative to $d\tau_1$. We shall assume that
$\rho_1$ and $\rho_2$ are piecewise continuous.

Since two interacting bodies ordinarily give rise not only to resultant
forces but also to resultant moments, it is necessary to verify that the
generalized law of gravitation 91.2 reduces to the parent law 91.1 and
yields no nonvanishing couples when the bodies $\tau_1$ and $\tau_2$ are allowed
to shrink to a point.

To show that this is indeed so, we introduce an orthogonal cartesian
reference frame Y, and denote the coordinates of points of the bodies
$\tau_1$ and $\tau_2$ by $(y_1^i)$ and $(y_2^i)$, respectively (Fig. 41). We replace
the distributed mass $\rho_1\,\Delta\tau_1$ by the concentrated mass $m_1$ at
$P_1(y_1^1, y_1^2, y_1^3)$, and the mass $\rho_2\,\Delta\tau_2$ by $m_2$ at
$P_2(y_2^1, y_2^2, y_2^3)$.

In accordance with the law 91.1 we have, for the components of force
$\Delta F^i$ due to these masses,

$\Delta F^i = \rho_1\rho_2\,\Delta\tau_1\,\Delta\tau_2\,\frac{y_2^i - y_1^i}{r^3},$

and for the components of moments¹⁹ $\Delta L_i$ relative to the origin O,

$\Delta L_i = e_{ijk}\,y_1^j\,\Delta F^k = e_{ijk}\,y_1^j\,\rho_1\rho_2\,\Delta\tau_1\,\Delta\tau_2\,\frac{y_2^k - y_1^k}{r^3}.$

19 We recall that the moment of force F relative to the origin, acting at a point
determined by the position vector r, is $\mathbf L = \mathbf r \times \mathbf F$ or, in terms of components,
$L_i = e_{ijk}\,y^jF^k$.

Fig. 41


Adding these vectorially gives the resultant force

(91.3) $F^i = \int_{\tau_1}\int_{\tau_2}\rho_1\rho_2\,\frac{y_2^i - y_1^i}{r^3}\,d\tau_1\,d\tau_2,$

and the resultant moment

(91.4) $L_i = \int_{\tau_1}\int_{\tau_2}e_{ijk}\,y_1^j\,\frac{y_2^k - y_1^k}{r^3}\,\rho_1\rho_2\,d\tau_1\,d\tau_2.$

We prove next that, as $\tau_1$ and $\tau_2$ are allowed to shrink toward $P_1$
and $P_2$, respectively (or, even if $\tau_1$ alone is allowed to shrink to zero)
the resultant moment $L_i$ tends to zero and equation 91.3 specializes to
the law in the form 91.1.

We choose the origin O of the coordinate system at P,, and let 7, shrink 
toward O and 7, toward P(Y}, yo”, y2*). Since p, and p; in equations 91.3 
and 91.4 are nonnegative functions, the first mean value theorem for 
integrals is applicable and we obtain 


$$F^i = \left[\frac{y_2^i - y_1^i}{r^3}\right]\int_{\tau_1}\int_{\tau_2}\rho_1\rho_2\,d\tau_1\,d\tau_2,$$

and

$$L_i = \left[e_{ijk}\,y_1^j\,\frac{y_2^k - y_1^k}{r^3}\right]\int_{\tau_1}\int_{\tau_2}\rho_1\rho_2\,d\tau_1\,d\tau_2,$$

where brackets denote the values of affected quantities evaluated at certain
points in $\tau_1$ and $\tau_2$. As the dimensions of $\tau_1$ are allowed to approach
zero, $y_1^i \to 0$, and hence $L_i \to 0$, whereas the first of the above integrals
reduces to

$$F^i = \frac{y_2^i}{r^3}\,m_1 m_2.$$

This is precisely the law of gravitation 91.1 for two particles located at
$(0, 0, 0)$ and $(y_2^1, y_2^2, y_2^3)$.
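
The limiting argument can be illustrated numerically. The sketch below is only an illustration under stated assumptions: both bodies are taken to be unit-density cubes (body 1 of side `a` centred at O, body 2 a fixed cube of side 0.5 centred at (2, 1, 0.5)), the integrals 91.3 and 91.4 are replaced by midpoint sums, and all function names are hypothetical. It reports the ratio |L|/|F|, which should become small as `a` shrinks.

```python
import math


def cube_cells(centre, side, n):
    """Midpoints and common volume of the n**3 cells of a cube."""
    h = side / n
    pts = [
        (centre[0] - side / 2 + (i + 0.5) * h,
         centre[1] - side / 2 + (j + 0.5) * h,
         centre[2] - side / 2 + (k + 0.5) * h)
        for i in range(n) for j in range(n) for k in range(n)
    ]
    return pts, h ** 3


def resultant_force_and_moment(a, n=5):
    """Midpoint sums for the resultant force and the moment about O.

    Body 1: cube of side a centred at O.  Body 2: cube of side 0.5 centred
    at (2, 1, 0.5).  Densities are 1.  These shapes are assumptions made
    for the illustration only.
    """
    body1, dv1 = cube_cells((0.0, 0.0, 0.0), a, n)
    body2, dv2 = cube_cells((2.0, 1.0, 0.5), 0.5, n)
    F = [0.0, 0.0, 0.0]
    L = [0.0, 0.0, 0.0]
    for y1 in body1:
        for y2 in body2:
            d = (y2[0] - y1[0], y2[1] - y1[1], y2[2] - y1[2])
            r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
            dF = [dv1 * dv2 * c / r3 for c in d]
            for c in range(3):
                F[c] += dF[c]
            # Moment about O of the elementary force acting at y1: y1 x dF.
            L[0] += y1[1] * dF[2] - y1[2] * dF[1]
            L[1] += y1[2] * dF[0] - y1[0] * dF[2]
            L[2] += y1[0] * dF[1] - y1[1] * dF[0]
    return F, L


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def moment_to_force_ratio(a):
    F, L = resultant_force_and_moment(a)
    return norm(L) / norm(F)
```

Calling `moment_to_force_ratio(a)` for a decreasing sequence of sides, say 0.4, 0.2, 0.1, gives a quick check that the couple becomes negligible compared with the force.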

It follows from the foregoing that a material body interacting with a
point mass produces no resultant moment L. Moreover, direct calculations
show that this is also true when the point mass is replaced by a
sphere $\tau$ whose density $\rho$ is a continuous function of the radius alone. The
resultant force F, exerted by the body on the sphere, turns out to be the
same as that produced by the body acting on a point mass $m = \int_\tau \rho\,d\tau$,
located at the center of the sphere.²⁰

Consider next a body $\tau$ with piecewise continuous density $\rho$ and let
$P(y^1, y^2, y^3)$ be a fixed point either within or outside $\tau$. The gravitational
potential V(P) at the point P due to $\tau$ is defined by the integral

$$V(P) = \int_\tau \frac{\rho(\xi^1, \xi^2, \xi^3)}{r}\,d\tau(\xi), \tag{91.5}$$

where $r = \sqrt{(y^1 - \xi^1)^2 + (y^2 - \xi^2)^2 + (y^3 - \xi^3)^2}$ is the distance between
$P(y^1, y^2, y^3)$ and the variable point $(\xi^1, \xi^2, \xi^3)$ associated with the volume
element $d\tau(\xi)$ of $\tau$. The integral 91.5, as we shall presently see, defines a
differentiable function $V(y^1, y^2, y^3)$ for all locations of P.
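
As a concrete illustration of the integral 91.5, the following sketch approximates V(P) for a homogeneous ball by a midpoint sum over small cubes. The ball, its radius `a`, the density `rho`, and the function name are assumptions introduced for the example. For an exterior point the result can be compared with (mass)/(distance), and at the center with $2\pi\rho a^2$, the value obtained by integrating $\rho/r$ over spherical shells.

```python
import math


def ball_potential(P, a=1.0, rho=1.0, n=30):
    """Midpoint-rule estimate of V(P) = integral of rho/r over a ball.

    The ball has radius a and is centred at the origin.  Returns (V, mass),
    where mass is the total mass of the cells that were kept.  With n even,
    no cell midpoint coincides with the origin, so P = (0, 0, 0) is allowed
    even though the integral is improper there.
    """
    h = 2.0 * a / n
    V = 0.0
    mass = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        for j in range(n):
            y = -a + (j + 0.5) * h
            for k in range(n):
                z = -a + (k + 0.5) * h
                if x * x + y * y + z * z <= a * a:
                    r = math.sqrt((P[0] - x) ** 2 + (P[1] - y) ** 2 + (P[2] - z) ** 2)
                    dm = rho * h ** 3
                    V += dm / r
                    mass += dm
    return V, mass
```

For example, `ball_potential((3.0, 0.0, 0.0))` returns a potential close to `mass / 3.0`, while `ball_potential((0.0, 0.0, 0.0))` stays finite, in line with the remark that 91.5 defines V for all locations of P.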

If P is outside the body, the integral (91.5) is proper and we can compute
as many derivatives of V as desired by differentiating (91.5) under the
integral sign with respect to the parameters $y^i$. In particular,

$$\frac{\partial V}{\partial y^i} = -F_i, \tag{91.6}$$

where the $F_i$ are components of the gravitational force

$$F_i(P) = \int_\tau \frac{\rho\,(y^i - \xi^i)}{r^3}\,d\tau \tag{91.7}$$

exerted by the body $\tau$ on a particle of unit mass located at P(y).

20 See, for example, I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics
and Modern Engineering, McGraw-Hill Book Co. (1958), pp. 410-411.


If P(y) is within $\tau$, the integral 91.5 is improper, since r = 0 when the
variable point $(\xi^1, \xi^2, \xi^3)$ coincides with $(y^1, y^2, y^3)$. However, an improper
integral may still be differentiated under the integral sign when the derived
integral is uniformly convergent. In our case the uniform convergence of (91.7)
follows from the familiar test on convergence of improper integrals.²¹
Moreover, it follows from the uniform convergence of (91.7) that F(P) is
continuous throughout all space.

Although V(P) is of class $C^\infty$ whenever P is exterior to $\tau$, more stringent
restrictions must be imposed on the continuity of $\rho$ to ensure the existence
of second derivatives of V(P) at points within $\tau$. It is a fact that if $\rho$ is of
class $C^1$, then the second derivatives of V(P) exist at all interior points of $\tau$.
A careful analysis of the difference quotients of the function F(P) shows,
moreover, that²² V(P) satisfies the Poisson equation

$$\nabla^2 V = -4\pi\rho \tag{91.8}$$

at all points within $\tau$ and Laplace's equation

$$\nabla^2 V = 0 \tag{91.9}$$

at points exterior to $\tau$.
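
Equations 91.8 and 91.9 can be checked on the simplest example. The sketch below assumes the well-known closed form of the potential of a homogeneous ball of radius `a` (it is not derived in the text): $V = 2\pi\rho(a^2 - r^2/3)$ inside and $V = 4\pi\rho a^3/(3r)$ outside. It applies a second-order central-difference Laplacian; the function names and step size are choices made for the illustration.

```python
import math


def ball_potential_closed_form(x, y, z, a=1.0, rho=1.0):
    """Potential of a homogeneous ball (assumed standard closed form)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r <= a:
        return 2.0 * math.pi * rho * (a * a - r * r / 3.0)
    return 4.0 * math.pi * rho * a ** 3 / (3.0 * r)


def laplacian(f, x, y, z, h=1e-3):
    """Seven-point central-difference approximation of the Laplacian."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)


def poisson_residual_inside(x, y, z, rho=1.0):
    """Laplacian of V plus 4*pi*rho at an interior point (should be near 0)."""
    return laplacian(ball_potential_closed_form, x, y, z) + 4.0 * math.pi * rho


def laplace_residual_outside(x, y, z):
    """Laplacian of V at an exterior point (should be near 0)."""
    return laplacian(ball_potential_closed_form, x, y, z)
```

The evaluation points must stay at least `h` away from the surface r = a, where the second derivatives of V are discontinuous.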

Equations 91.8 and 91.9 imply that the second derivatives of V(P) in
general suffer discontinuities whenever P crosses the surface $\Sigma$ of $\tau$. In
Sec. 93, we establish the validity of (91.8) and (91.9) with the aid of Gauss'
flux theorem. A treatment based on Gauss' flux theorem has the advantages
of physical suggestiveness that do not appear in a purely analytic discussion
based on the aforementioned study of the difference quotients.
However, it imposes quite severe restrictions on the character of regions
and surfaces that bound the regions. The Gauss flux theorem is a theorem
in the large, and it need not be used to deduce the local results 91.8 and
91.9, which concern the properties of potentials in the neighborhood of a
given point.


92. Integral Transformation Theorems 


To provide analytic tools for our further study, we translate the well-known
integral transformation theorems of Gauss, Green, and Stokes into
the language of tensor calculus.


21 Since, for all values of $(\xi^i)$ in the neighborhood of $(y^i)$,
$\left| r^{\mu}\,\rho\,(y^i - \xi^i)/r^3 \right| < A$, $(\mu < 3)$,
where A is a constant independent of $(\xi^i)$. For a discussion of this test see I. S.
Sokolnikoff, Advanced Calculus, McGraw-Hill Book Co. (1939), pp. 367-372, or
O. D. Kellogg, Foundations of Potential Theory, Springer-Verlag (1929), pp. 146-156.

22 See O. D. Kellogg, op. cit., Chapter 6, pp. 146-156. 


Let F be a vector point function of class $C^1$ in an open region $\tau$ bounded
by the regular²³ surface $\Sigma$ and continuous in the closed region $\Sigma + \tau$. We
denote by n the exterior unit normal to $\Sigma$ and state the divergence theorem
in the form

$$\int_\tau \operatorname{div}\mathbf{F}\,d\tau = \int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma. \tag{92.1}$$

The integral with the subscript $\tau$ is evaluated over the volume $\tau$, whereas
the integral in the right-hand member of (92.1) measures the flux of the
vector quantity F over the surface $\Sigma$.

We recall from elementary vector analysis that, in orthogonal cartesian
coordinates, the divergence of F is given by the formula

$$\operatorname{div}\mathbf{F} = \frac{\partial F^1}{\partial y^1} + \frac{\partial F^2}{\partial y^2} + \frac{\partial F^3}{\partial y^3}. \tag{92.2}$$
If the components of F relative to an arbitrary curvilinear coordinate
system X are denoted by $F^i$, then the covariant derivative of $F^i$ is

$$F^i_{,j} = \frac{\partial F^i}{\partial x^j} + \begin{Bmatrix} i \\ k\,j \end{Bmatrix} F^k,$$

and we observe that the invariant $F^i_{,i}$ in cartesian coordinates reduces to
the right-hand member of (92.2), and hence it represents the divergence of
the vector field F. In addition,

$$\mathbf{F}\cdot\mathbf{n} = g_{ij}F^i n^j = F^i n_i,$$

and hence we can rewrite equation 92.1 in the form

$$\int_\tau F^i_{,i}\,d\tau = \int_\Sigma F^i n_i\,d\sigma. \tag{92.3}$$
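
A small numerical check of the divergence theorem 92.3 in cartesian coordinates may make the statement concrete. The region (the unit cube), the sample field $F = (x^2,\ yz,\ z)$, and the function names below are illustrative assumptions; both sides are evaluated by the midpoint rule, which is exact for this particular field.

```python
def F(x, y, z):
    """Sample vector field (an assumption for the illustration)."""
    return (x * x, y * z, z)


def div_F(x, y, z):
    """Divergence of the sample field, computed by hand: 2x + z + 1."""
    return 2.0 * x + z + 1.0


def volume_integral(n=20):
    """Midpoint rule for the integral of div F over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += div_F((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3


def surface_flux(n=20):
    """Midpoint rule for the outward flux of F through the six faces."""
    h = 1.0 / n
    total = 0.0
    for axis in range(3):
        for side, sign in ((0.0, -1.0), (1.0, 1.0)):
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = side
                    p[(axis + 1) % 3] = (i + 0.5) * h
                    p[(axis + 2) % 3] = (j + 0.5) * h
                    # The outward normal is sign * e_axis, so F.n = sign * F[axis].
                    total += sign * F(*p)[axis] * h * h
    return total
```

Both `volume_integral()` and `surface_flux()` should agree; for the sample field the common value can also be found by hand as 1 + 1/2 + 1 = 5/2.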


From this theorem two other theorems (usually attributed to Green) can 
be derived easily. 

Let $u(x^1, x^2, x^3)$ and $v(x^1, x^2, x^3)$ be two scalar functions of class $C^2$ in $\tau$
and of class $C^1$ in the closed region $\Sigma + \tau$. We denote the gradients of u
and v by $u_i$ and $v_i$, respectively, so that

$$u_i = \frac{\partial u}{\partial x^i} \quad\text{and}\quad v_i = \frac{\partial v}{\partial x^i}.$$

If we set

$$F_i = u\,v_i$$

and form the divergence of $F^i$, we get

$$F^i_{,i} = g^{ij}F_{i,j} = g^{ij}(u\,v_{i,j} + v_i u_j).$$

We insert this in equation 92.3 and obtain the desired formula

$$\int_\tau g^{ij}(u\,v_{i,j} + v_i u_j)\,d\tau = \int_\Sigma u\,v_i n^i\,d\sigma. \tag{92.4}$$

23 We omit a rather involved discussion of the properties of surfaces to which the
divergence theorem is applicable. For a detailed treatment of this consult O. D. Kellogg,
Foundations of Potential Theory, pp. 97-121.


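The invariant $g^{ij}v_{i,j}$ appearing in the left-hand member of equation
92.4, when expressed in cartesian coordinates, is the Laplacian of v,
$\partial^2 v/\partial y^i\,\partial y^i$, and if we denote the Laplacian operator by the symbol $\nabla^2$,
we can write

$$g^{ij}v_{i,j} = \nabla^2 v.$$

Also the inner product $g^{ij}v_i u_j$ can be written as

$$g^{ij}v_i u_j = \nabla u\cdot\nabla v,$$

where we use the customary operator $\nabla$ to denote the gradient.
Hence formula 92.4 can be written in the familiar form

$$\int_\tau u\,\nabla^2 v\,d\tau = \int_\Sigma u\,\mathbf{n}\cdot\nabla v\,d\sigma - \int_\tau \nabla u\cdot\nabla v\,d\tau, \tag{92.5}$$

where

$$\mathbf{n}\cdot\nabla v = \frac{\partial v}{\partial n}.$$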

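Interchanging u and v in equation 92.5 and subtracting the resulting
formula from equation 92.5 yields a symmetrical form of Green's theorem

$$\int_\tau (u\,\nabla^2 v - v\,\nabla^2 u)\,d\tau = \int_\Sigma \left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right) d\sigma. \tag{92.6}$$

Theorems stated in equations 92.3, 92.4, 92.5, and 92.6 are, perhaps,
the ones most frequently used in mathematical physics.
The Laplacian of v,

$$\nabla^2 v = g^{ij}v_{i,j}, \tag{92.7}$$

when written out explicitly in terms of the Christoffel symbols associated
with the curvilinear coordinates $x^i$ covering $E_3$, is

$$\nabla^2 v = g^{ij}\left(\frac{\partial^2 v}{\partial x^i\,\partial x^j} - \begin{Bmatrix} k \\ i\,j \end{Bmatrix}\frac{\partial v}{\partial x^k}\right), \tag{92.8}$$

and the divergence of the vector $F^i$ is

$$F^i_{,i} = \frac{\partial F^i}{\partial x^i} + \begin{Bmatrix} i \\ j\,i \end{Bmatrix} F^j. \tag{92.9}$$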

Formulas 92.8 and 92.9 can be written in different forms, which frequently
are more convenient in computations. Equation 31.10 yields

$$\begin{Bmatrix} i \\ i\,j \end{Bmatrix} = \frac{\partial}{\partial x^j}\log\sqrt{g}, \tag*{[31.10]}$$

and hence the divergence $F^i_{,i}$ in (92.9) can be written as

$$F^i_{,i} = \frac{\partial F^i}{\partial x^i} + \left(\frac{\partial}{\partial x^i}\log\sqrt{g}\right) F^i,$$

or

$$F^i_{,i} = \frac{1}{\sqrt{g}}\,\frac{\partial(\sqrt{g}\,F^i)}{\partial x^i}. \tag{92.10}$$

If we set in this formula $F^i = g^{ij}(\partial v/\partial x^j)$, we get

$$\nabla^2 v = \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^i}\left(\sqrt{g}\,g^{ij}\,\frac{\partial v}{\partial x^j}\right). \tag{92.11}$$

We turn next to a consideration of Stokes's theorem which permits us
to express certain surface integrals in terms of line integrals.

Let a portion of regular surface $\Sigma$ be bounded by a closed regular curve
C, and let F be any vector function of class $C^1$ defined on $\Sigma$ and on C.
The theorem of Stokes states that

$$\int_\Sigma \mathbf{n}\cdot\operatorname{curl}\mathbf{F}\,d\sigma = \int_C \mathbf{F}\cdot\boldsymbol{\lambda}\,ds, \tag{92.12}$$

where $\boldsymbol{\lambda}$ is the unit tangent vector to C, and curl F is the vector whose
components in orthogonal cartesian coordinates are determined from

$$\operatorname{curl}\mathbf{F} =
\begin{vmatrix}
\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\
\dfrac{\partial}{\partial y^1} & \dfrac{\partial}{\partial y^2} & \dfrac{\partial}{\partial y^3} \\
F^1 & F^2 & F^3
\end{vmatrix}, \tag{92.13}$$

the $\mathbf{e}_i$ being the unit base vectors in a cartesian frame. The determinant
in 92.13 can be written as a symbolic vector product $\nabla\times\mathbf{F}$.

We consider the covariant derivative $F_{i,j}$ of the vector $F_i$ and form a
contravariant vector

$$G^i = -\epsilon^{ijk}F_{j,k}. \tag{92.14}$$

It is readily checked that in cartesian coordinates equation 92.14 reduces
to 92.13, and we define the vector G to be the curl of F.

Since $\mathbf{n}\cdot\operatorname{curl}\mathbf{F} = n_i G^i = -\epsilon^{ijk}F_{j,k}\,n_i$ and the components of the unit
tangent vector $\boldsymbol{\lambda}$ are $dx^i/ds$, we may rewrite equation 92.12 as

$$-\int_\Sigma \epsilon^{ijk}F_{j,k}\,n_i\,d\sigma = \int_C F_i\,\frac{dx^i}{ds}\,ds. \tag{92.15}$$

The integral $\int_C F_i\,dx^i$ is called the circulation of F along the contour C.
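
The form of the Laplacian in which $\sqrt{g}\,g^{ij}\,\partial v/\partial x^j$ is differentiated and the result divided by $\sqrt{g}$ lends itself to direct numerical evaluation. The sketch below is a minimal illustration for spherical coordinates $(r, \theta, \varphi)$, where the metric is diagonal with $\sqrt{g} = r^2\sin\theta$; the finite-difference scheme, step size, and function names are assumptions of the example, not part of the text.

```python
import math


def sqrt_g(q):
    """Square root of the metric determinant in spherical coordinates."""
    r, theta, _ = q
    return r * r * math.sin(theta)


def g_inverse_diagonal(q):
    """Diagonal entries g^{11}, g^{22}, g^{33} of the inverse metric."""
    r, theta, _ = q
    return (1.0, 1.0 / r ** 2, 1.0 / (r * math.sin(theta)) ** 2)


def laplacian_curvilinear(v, q, h=1e-3):
    """(1/sqrt g) d/dx^i (sqrt g g^{ii} dv/dx^i) by nested central differences."""
    total = 0.0
    for i in range(3):
        def weighted_derivative(shift, i=i):
            # sqrt(g) * g^{ii} * dv/dx^i evaluated at q displaced along x^i.
            p = list(q)
            p[i] += shift
            hi = list(p)
            lo = list(p)
            hi[i] += h / 2.0
            lo[i] -= h / 2.0
            dv = (v(hi) - v(lo)) / h
            return sqrt_g(p) * g_inverse_diagonal(p)[i] * dv
        total += (weighted_derivative(h / 2.0) - weighted_derivative(-h / 2.0)) / h
    return total / sqrt_g(q)


def v_r_squared(q):
    """v = r^2, whose Laplacian is 6."""
    return q[0] ** 2


def v_z(q):
    """v = r cos(theta) = z, a harmonic function."""
    return q[0] * math.cos(q[1])
```

Evaluating `laplacian_curvilinear(v_r_squared, (1.3, 0.9, 0.4))` should give a value close to 6, and the same call with `v_z` a value close to 0. Only the two metric helper functions need to change for another orthogonal system.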


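Problems

1. Prove that

$$\int_\Sigma v_i n^i\,d\sigma = \int_\tau \nabla^2 v\,d\tau,$$

where $v_i = \partial v/\partial x^i$ is continuous on $\Sigma$ and of class $C^1$ in $\tau$.

2. Show that

(a) In plane polar coordinates with $ds^2 = (dr)^2 + r^2(d\theta)^2$,

$$\operatorname{div}\mathbf{F} = \frac{1}{r}\left[\frac{\partial(rF_r)}{\partial r} + \frac{\partial F_\theta}{\partial\theta}\right],
\qquad
\nabla^2 v = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r\,\frac{\partial v}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\frac{1}{r}\,\frac{\partial v}{\partial\theta}\right)\right],$$

where $F_r$ and $F_\theta$ are the physical components of the vector F, that is,
$\mathbf{F} = F_r\mathbf{r}_1 + F_\theta\boldsymbol{\theta}_1$, where $\mathbf{r}_1$ and $\boldsymbol{\theta}_1$ are unit vectors.

(b) In cylindrical coordinates with $ds^2 = (dr)^2 + r^2(d\theta)^2 + (dz)^2$,

$$\operatorname{div}\mathbf{F} = \frac{1}{r}\frac{\partial(rF_r)}{\partial r} + \frac{1}{r}\frac{\partial F_\theta}{\partial\theta} + \frac{\partial F_z}{\partial z},
\qquad
\nabla^2 v = \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial v}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 v}{\partial\theta^2} + \frac{\partial^2 v}{\partial z^2},$$

where $\mathbf{F} = F_r\mathbf{r}_1 + F_\theta\boldsymbol{\theta}_1 + F_z\mathbf{z}_1$ and $\mathbf{r}_1$, $\boldsymbol{\theta}_1$, $\mathbf{z}_1$ are unit vectors, so that $F_r$, $F_\theta$,
and $F_z$ are the physical components of F.

(c) In spherical coordinates with $ds^2 = (dr)^2 + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\varphi)^2$,

$$\operatorname{div}\mathbf{F} = \frac{1}{r^2}\frac{\partial(r^2F_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,F_\theta)}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial F_\varphi}{\partial\varphi},$$

$$\nabla^2 v = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial v}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial v}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 v}{\partial\varphi^2},$$

where the physical components of F are $F_r$, $F_\theta$, $F_\varphi$, so that $\mathbf{F} = \mathbf{r}_1F_r + \boldsymbol{\theta}_1F_\theta + \boldsymbol{\varphi}_1F_\varphi$,
$\mathbf{r}_1$, $\boldsymbol{\theta}_1$, and $\boldsymbol{\varphi}_1$ being the unit vectors.

3. Show that, in an orthogonal curvilinear frame X,

$$\operatorname{curl}\mathbf{F} = \frac{1}{\sqrt{g_{11}g_{22}g_{33}}}
\begin{vmatrix}
\sqrt{g_{11}}\,\mathbf{a}_1 & \sqrt{g_{22}}\,\mathbf{a}_2 & \sqrt{g_{33}}\,\mathbf{a}_3 \\
\dfrac{\partial}{\partial x^1} & \dfrac{\partial}{\partial x^2} & \dfrac{\partial}{\partial x^3} \\
\sqrt{g_{11}}\,F^1 & \sqrt{g_{22}}\,F^2 & \sqrt{g_{33}}\,F^3
\end{vmatrix},$$

where the $\mathbf{a}_i$ are the unit base vectors and $\mathbf{F} = F^1\mathbf{a}_1 + F^2\mathbf{a}_2 + F^3\mathbf{a}_3$.

4. Show that the contravariant components of the curl of a vector F are:

$$\frac{1}{\sqrt{g}}\left(\frac{\partial F_3}{\partial x^2} - \frac{\partial F_2}{\partial x^3}\right),\quad
\frac{1}{\sqrt{g}}\left(\frac{\partial F_1}{\partial x^3} - \frac{\partial F_3}{\partial x^1}\right),\quad
\frac{1}{\sqrt{g}}\left(\frac{\partial F_2}{\partial x^1} - \frac{\partial F_1}{\partial x^2}\right).$$

5. Prove that under suitable restrictions on continuity the curl of a gradient
vector vanishes identically.

6. In orthogonal curvilinear coordinates,

$$g_{ij} = g^{ij} = 0,\ i \ne j, \quad\text{and}\quad g_{11} = \frac{1}{g^{11}},\ g_{22} = \frac{1}{g^{22}},\ g_{33} = \frac{1}{g^{33}}.$$

If we set $ds^2 = e_1^2(dx^1)^2 + e_2^2(dx^2)^2 + e_3^2(dx^3)^2$, so that $g_{11} = e_1^2$, $g_{22} = e_2^2$,
$g_{33} = e_3^2$, then

(a) $[ij, k] = 0$, $\begin{Bmatrix} k \\ i\,j \end{Bmatrix} = 0$, for i, j, k distinct,

$$[ij, i] = -[ii, j] = e_i\frac{\partial e_i}{\partial x^j}, \qquad \begin{Bmatrix} i \\ i\,j \end{Bmatrix} = \frac{\partial\log e_i}{\partial x^j},$$

$$\begin{Bmatrix} j \\ i\,i \end{Bmatrix} = -\frac{e_i}{(e_j)^2}\frac{\partial e_i}{\partial x^j}, \qquad \begin{Bmatrix} i \\ i\,i \end{Bmatrix} = \frac{\partial\log e_i}{\partial x^i} \quad\text{(no sums)},$$

(b)
$$\nabla^2 v = \frac{1}{e_1e_2e_3}\left[\frac{\partial}{\partial x^1}\left(\frac{e_2e_3}{e_1}\frac{\partial v}{\partial x^1}\right) + \frac{\partial}{\partial x^2}\left(\frac{e_3e_1}{e_2}\frac{\partial v}{\partial x^2}\right) + \frac{\partial}{\partial x^3}\left(\frac{e_1e_2}{e_3}\frac{\partial v}{\partial x^3}\right)\right].$$
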
93. Theorem of Gauss. Solution of Poisson’s Equation 


In accordance with Newton's law of gravitation, a particle P of mass m
exerts on a particle $P_1$ of unit mass, located at a distance r from P, a force
of magnitude $F = m/r^2$. Imagine a closed regular surface $\Sigma$ drawn around
the point P, and let $\theta$ denote the angle between the unit exterior normal
n to $\Sigma$ and the axis of a cone with its vertex at P. This cone subtends an
element of surface $d\sigma$. (See Fig. 42.) The flux of the gravitational field
produced by m is

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = \int_\Sigma \frac{m\cos\theta}{r^2}\,d\sigma = \int \frac{m\cos\theta}{r^2}\,\frac{r^2\,d\omega}{\cos\theta},$$

where $d\sigma = r^2\,d\omega/\cos\theta$ and $d\omega$ is the solid angle subtended by $d\sigma$.

Fig. 42

We thus have

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = m\int d\omega = 4\pi m. \tag{93.1}$$

If there are n discrete particles of masses $m_i$ located within $\Sigma$, then

$$\mathbf{F}\cdot\mathbf{n} = \sum_{i=1}^{n} \frac{m_i\cos\theta_i}{r_i^2},$$

and the total flux is

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = 4\pi\sum_{i=1}^{n} m_i. \tag{93.2}$$
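
The flux result for a single particle can be verified numerically on a surface that is not a sphere. In the sketch below the closed surface is assumed to be the cube $[-1, 1]^3$, the field is $\mathbf{F} = m\,\mathbf{r}/r^3$ with r measured from the particle (the outward-pointing convention under which the flux equals $+4\pi m$), and the surface integral is a midpoint sum; the function name and grid size are illustrative.

```python
import math


def flux_through_cube(source, m=1.0, n=60):
    """Midpoint-rule flux of F = m * d / |d|^3 through the cube [-1, 1]^3.

    d is the vector from the particle at `source` to the surface point and
    the normal is the exterior normal of the cube.
    """
    h = 2.0 / n
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign
                    p[(axis + 1) % 3] = -1.0 + (i + 0.5) * h
                    p[(axis + 2) % 3] = -1.0 + (j + 0.5) * h
                    d = [p[c] - source[c] for c in range(3)]
                    r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    # F.n with n = sign * e_axis.
                    total += sign * m * d[axis] / r3 * h * h
    return total
```

With the particle inside, for instance `flux_through_cube((0.3, -0.2, 0.1))`, the result should be close to $4\pi$; with the particle outside, for instance at `(2.5, 0.4, 0.1)`, it should be close to zero, since each flux cone then cuts the surface twice.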


The result embodied in formula 93.2 can be easily generalized to continuous
distributions of matter whenever such distributions nowhere meet the
surface $\Sigma$. The procedure is a standard one. The contribution to the
flux integral from the mass element $\rho\,d\tau$, contained within $\Sigma$, is

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = \int_\Sigma \frac{\rho\,d\tau\cos\theta}{r^2}\,d\sigma,$$

and the contribution from all masses contained entirely within $\Sigma$ is

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = \int_\Sigma \left(\int_\tau \frac{\rho\cos\theta}{r^2}\,d\tau\right) d\sigma, \tag{93.3}$$

where $\int_\tau$ denotes the volume integral over all bodies interior to $\Sigma$. Since
all masses are assumed to be interior to $\Sigma$, r never vanishes, so that the
integrand in (93.3) is continuous, and hence one can interchange the order
of integration to obtain

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = \int_\tau \rho\left(\int_\Sigma \frac{\cos\theta}{r^2}\,d\sigma\right) d\tau. \tag{93.4}$$

But the integral $\int_\Sigma \dfrac{\cos\theta\,d\sigma}{r^2} = 4\pi$, since it represents the flux due to a
unit mass contained within $\Sigma$. Hence

$$\int_\Sigma \mathbf{F}\cdot\mathbf{n}\,d\sigma = 4\pi\int_\tau \rho\,d\tau = 4\pi m, \tag{93.5}$$

where m denotes the total mass contained within $\Sigma$.

We can now state

GAUSS'S THEOREM. The integral of the normal component of the gravitational
flux computed over a regular surface $\Sigma$ containing gravitating masses
wholly within it is equal to $4\pi m$, where m is the total mass enclosed by $\Sigma$.

This theorem can be extended to situations where $\Sigma$ intersects the distributed
masses with sufficiently smooth density $\rho$. Let a regular closed
surface $\Sigma$ intersect a distribution of mass with continuously differentiable
density $\rho$. We construct two surfaces $\Sigma'$ and $\Sigma''$ parallel to $\Sigma$ (cf. Sec. 73)
such that $\Sigma'$ is interior to $\Sigma$ and $\Sigma''$ encloses $\Sigma$ (Fig. 43). The flux produced
by the gravitating masses varies continuously across $\Sigma'$ and $\Sigma''$
when these surfaces, while remaining parallel, are made to approach $\Sigma$.

Since $\Sigma''$ does not intersect $\Sigma$, Gauss's flux theorem can be applied to
compute the total flux over $\Sigma''$ produced by the masses within $\Sigma$. Accordingly,

$$\int_{\Sigma''} (\mathbf{F}\cdot\mathbf{n})_i\,d\sigma = 4\pi m, \tag{93.6}$$

Fig. 43

where m is the total mass within $\Sigma$ and the subscript i refers to the flux
produced by the masses inside $\Sigma$. On the other hand, the net flux over $\Sigma'$
produced by all masses outside $\Sigma$ is

$$\int_{\Sigma'} (\mathbf{F}\cdot\mathbf{n})_o\,d\sigma = 0, \tag{93.7}$$

for the flux cone from any point outside $\Sigma$ cuts $\Sigma'$ twice.

Now, if we let $\Sigma'$ and $\Sigma''$ approach $\Sigma$, the right-hand members in (93.6)
and (93.7) do not change, whereas the left-hand member of (93.6) becomes
the flux integral over $\Sigma$ produced by the masses within $\Sigma$ and the left-hand
member of (93.7) represents the flux over $\Sigma$ due to all masses exterior to $\Sigma$.
Thus the total flux produced by a distribution of masses within $\Sigma$ is

$$\int_\Sigma (\mathbf{F}\cdot\mathbf{n})\,d\sigma = 4\pi m = \int_\tau 4\pi\rho\,d\tau. \tag{93.8}$$


If we further suppose that F is continuously differentiable, we can apply
the divergence theorem to the surface integral in (93.8), and get

$$\int_\tau (\operatorname{div}\mathbf{F} - 4\pi\rho)\,d\tau = 0. \tag{93.9}$$

This relation is true for an arbitrary region $\tau$ and, since the integrand in
(93.9) is continuous, we conclude that

$$\operatorname{div}\mathbf{F} = 4\pi\rho \quad\text{throughout } \tau. \tag{93.10}$$

However, formula 91.6 states that $\mathbf{F} = -\nabla V$, and thus (93.10) is equivalent
to

$$\nabla^2 V = -4\pi\rho. \tag{93.11}$$

Thus at all points interior to the body $\tau$, the gravitational potential
satisfies the Poisson equation. We note in conclusion that formula

$$V(P) = \int_\tau \frac{\rho\,d\tau}{r} \tag*{[91.5]}$$

gives a solution of equation 93.11 at all points in $\tau$.


94. Green’s Third Identity. Harmonic Functions 


Green's symmetrical formula

$$\int_\tau (u\,\nabla^2 v - v\,\nabla^2 u)\,d\tau = \int_\Sigma \left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right) d\sigma \tag*{[92.6]}$$

is applicable to any pair of functions u, v of class $C^2$ in the open region $\tau$
and of class $C^1$ in the closed region $\Sigma + \tau$. Let us set $u = 1/r$ and $v = V$,
where r is the distance between the points $P(x^1, x^2, x^3)$ and $P_1(\xi^1, \xi^2, \xi^3)$,
and V is the gravitational potential of a distribution of mass with continuously
differentiable density $\rho$, so that V is of class $C^2$ in $\tau$.

Since 1/r has a discontinuity at $(x^i) = (\xi^i)$, we delete P(x) from $\tau$ by
enclosing it by a sphere $\sigma$ of radius $\delta$ and with center of $\sigma$ at P. Functions
$u = 1/r$ and $v = V$ then satisfy the conditions of theorem 92.6 in the
region $\tau - \epsilon$ bounded by $\Sigma$ and $\sigma$ (Fig. 44). However, in the region $\tau - \epsilon$,
$\nabla^2 u = \nabla^2(1/r) = 0$, and formula 92.6 yields


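$$\int_{\tau-\epsilon} \frac{1}{r}\,\nabla^2 V\,d\tau
= \int_\Sigma \left(\frac{1}{r}\frac{\partial V}{\partial n} - V\frac{\partial(1/r)}{\partial n}\right) d\sigma
+ \int_\sigma \left(\frac{1}{r}\frac{\partial V}{\partial n} - V\frac{\partial(1/r)}{\partial n}\right) d\sigma, \tag{94.1}$$

where n is the unit exterior normal to the surface $\Sigma + \sigma$. Since, however,
on $\sigma$ the normal n is directed toward P,

$$\int_\sigma \left(\frac{1}{r}\frac{\partial V}{\partial n} - V\frac{\partial(1/r)}{\partial n}\right) d\sigma
= -\int_\sigma \left(\frac{1}{r}\frac{\partial V}{\partial r} + \frac{V}{r^2}\right) r^2\,d\omega
= -\delta\int \left(\frac{\partial V}{\partial r}\right)_{r=\delta} d\omega - 4\pi\bar{V}, \tag{94.2}$$

Fig. 44

where $\bar{V}$ is the mean value of V over the sphere $\sigma$, and $\omega$ denotes the solid
angle.

On letting $\delta \to 0$, the right-hand member of (94.2) yields $-4\pi V(P)$, and
it follows from (94.1) that

$$V(P) = -\frac{1}{4\pi}\int_\tau \frac{\nabla^2 V}{r}\,d\tau
+ \frac{1}{4\pi}\int_\Sigma \frac{1}{r}\frac{\partial V}{\partial n}\,d\sigma
- \frac{1}{4\pi}\int_\Sigma V\,\frac{\partial(1/r)}{\partial n}\,d\sigma. \tag{94.3}$$

The important formula 94.3, known as Green's third identity, states that
every function V of class $C^1$ in $\Sigma + \tau$ and of class $C^2$ in $\tau$ can be represented
as the sum of three integrals appearing in (94.3). If V(P) is regular at
infinity, that is, if for sufficiently large values of r, V is such that

$$|V| < \frac{m}{r} \quad\text{and}\quad \left|\frac{\partial V}{\partial r}\right| < \frac{m}{r^2}, \tag{94.4}$$

where m is a constant independent of r, then on extending the integration
in (94.3) over all space we get

$$V(P) = -\frac{1}{4\pi}\int_\infty \frac{\nabla^2 V}{r}\,d\tau, \tag{94.5}$$

provided that this volume integral converges. The surface integrals in
(94.3), when extended over all space, vanish by virtue of the regularity
conditions 94.4.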

At all points not occupied by matter (that is, where $\rho = 0$), the gravitational
potential V satisfies Laplace's equation

$$\nabla^2 V = 0. \tag{94.6}$$

A function satisfying equation 94.6 in a given region is said to be harmonic
in that region. If V is harmonic in the region $\tau$, formula 94.3 reduces to

$$V(P) = \frac{1}{4\pi}\int_\Sigma \frac{1}{r}\frac{\partial V}{\partial n}\,d\sigma - \frac{1}{4\pi}\int_\Sigma V\,\frac{\partial(1/r)}{\partial n}\,d\sigma, \tag{94.7}$$

so that the values of V are completely determined in $\tau$ when the values of
V and of its normal derivative $\partial V/\partial n$ are known on $\Sigma$. However, these
surface values cannot be specified independently of one another, and we
shall see that the specification of the values of V alone on $\Sigma$ fully determines
V(P) at all points of $\tau$. On the other hand, the specification of $\partial V/\partial n$
on $\Sigma$ determines V(P) in $\tau$ to within an arbitrary constant, provided that

$$\int_\Sigma \frac{\partial V}{\partial n}\,d\sigma = 0. \tag{94.8}$$

The condition 94.8 follows directly from the formula 92.6 on setting u = 1
and v = V. It is a necessary condition satisfied by every harmonic function.


If $\Sigma$ in (94.7) is the surface of a sphere of radius R, with center at P,
then $\partial(1/r)/\partial n = \partial(1/r)/\partial r = -1/r^2$ and (94.7) gives

$$V(P) = \frac{1}{4\pi R^2}\int_\Sigma V\,d\sigma, \tag{94.9}$$

when we note the condition 94.8. Formula 94.9 states an important
property of harmonic functions: The value of a harmonic function V at
the center of a sphere is equal to the mean value of V over the surface of that
sphere. This property enables us to prove the following basic theorem on
harmonic functions.

THEOREM. A function V harmonic in a closed regular region $\Sigma + \tau$
assumes its maximum and minimum values on the boundary $\Sigma$ of $\tau$, with the
single exception when V = constant throughout $\tau$.

To prove this theorem assume that V takes on its maximum (or minimum)
$V_0$ at some interior point P of $\tau$. We construct a sphere S in $\tau$ with center
at P and of radius R, then

$$V(P) = \frac{\int_S V\,d\sigma}{4\pi R^2}$$

by (94.9). But the right-hand member of this expression is the average
value $\bar{V}$ of V over S, and the average value $\bar{V}$ can equal the maximum $V_0$
only if $V = V_0$ on S. Furthermore, since R is arbitrary, we conclude that
$V = V_0$ at every interior point of S. To show that V has the same constant
value $V_0$ at every point Q of $\tau$, we connect P and Q by a curve C of finite
length and cover it by a sequence of overlapping spheres with centers on C.
Within each sphere of this sequence, V has the same constant value $V_0$ and
hence $V(Q) = V_0$. Thus, unless V = constant throughout $\tau$, it takes on
its extreme values on the boundary $\Sigma$.
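
Both the mean value property and the maximum principle are easy to observe numerically. The sketch below is an illustration only: the sample function, which is harmonic away from the point (3, 0, 0), the equal-area sampling of the sphere by midpoints in $\cos\theta$ and $\varphi$, and all names are assumptions made for the example.

```python
import math


def V(x, y, z):
    """Sample function, harmonic everywhere except at the point (3, 0, 0)."""
    return x * x - y * y + 2.0 * z + 1.0 / math.sqrt((x - 3.0) ** 2 + y * y + z * z)


def sphere_points(centre, R, n_u=60, n_phi=120):
    """Equal-area sample points: midpoints in u = cos(theta) and in phi."""
    pts = []
    for i in range(n_u):
        u = -1.0 + (i + 0.5) * 2.0 / n_u
        s = math.sqrt(1.0 - u * u)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            pts.append((centre[0] + R * s * math.cos(phi),
                        centre[1] + R * s * math.sin(phi),
                        centre[2] + R * u))
    return pts


def sphere_mean(f, centre, R):
    """Approximate mean value of f over the sphere of radius R about centre."""
    pts = sphere_points(centre, R)
    return sum(f(*p) for p in pts) / len(pts)


def max_on_sphere(f, centre, R):
    """Largest sampled value of f on the sphere of radius R about centre."""
    return max(f(*p) for p in sphere_points(centre, R))
```

`sphere_mean(V, (0.0, 0.0, 0.0), 1.0)` should agree closely with `V(0.0, 0.0, 0.0)`, and the sampled maxima on concentric spheres of radius less than 1 should stay below the sampled maximum on the unit sphere, as the theorem requires for a nonconstant harmonic function.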

The determination of a harmonic function V in $\tau$ from the specified
values of V on the boundary $\Sigma$ of $\tau$ is known as the Problem of Dirichlet.
If $\tau$ is a finite region, we have an interior problem and when $\tau$ is an infinite
region bounded by a closed surface $\Sigma$, we have an exterior problem of
Dirichlet.

It is easy to prove that the interior problem of Dirichlet for a regular
region $\Sigma + \tau$ does not have more than one solution. For, let there be two
functions $V_1$ and $V_2$, harmonic in $\tau$ and which assume the same values on
the boundary $\Sigma$. But $V = V_1 - V_2$ is also harmonic, and it assumes zero
values on $\Sigma$. This implies, however, that V = 0 throughout $\tau$, since
otherwise V would have to take on its positive maximum, or a negative
minimum, in the interior. In the same way we can prove the uniqueness of
solution of the exterior problem of Dirichlet if we suppose that V is
regular at infinity.


The determination of a harmonic function V in $\tau$ which satisfies on the
boundary $\Sigma$ of $\tau$ the condition

$$\frac{\partial V}{\partial n} = f(P), \quad\text{with}\quad \int_\Sigma f(P)\,d\sigma = 0, \tag{94.10}$$

is called the Problem of Neumann.

Since V = constant is a harmonic function that satisfies the condition
$\partial V/\partial n = 0$ on $\Sigma$, we conclude that the solution of the Neumann problem
(if it exists) is determined to within an arbitrary constant. It is possible to
prove, although the proof is by no means easy, that the Dirichlet and
Neumann problems are solvable for finite regular regions when the
specified values on the boundary are continuous.²⁴


Problem 


Show that formula 94.7 is valid in an infinite region $\tau$ exterior to a closed surface
$\Sigma$ whenever V is regular at infinity. Hint: Apply formula 94.7 to a finite region
bounded by $\Sigma$ and by a sphere S of radius R so large that S encloses $\Sigma$.


95. Functions of Green and Neumann


We have just shown that the solution of the interior problem of Dirichlet
in Laplace's equation

$$\nabla^2 V = 0 \ \text{ in } \tau, \qquad V = f(P) \ \text{ on } \Sigma, \tag{95.1}$$

when it exists, is necessarily unique. Also the solution of the Neumann
interior problem

$$\nabla^2 V = 0 \ \text{ in } \tau, \qquad \frac{\partial V}{\partial n} = g(P) \ \text{ on } \Sigma, \tag{95.2}$$

with $\int_\Sigma g(P)\,d\sigma = 0$, is determined to within an arbitrary constant when
g(P) is continuous. To make the solution of the Neumann problem unique,
we adjoin to (95.2) the normalizing condition

$$\int_\Sigma V\,d\sigma = 0. \tag{95.3}$$

When Laplace's equations in (95.1) and (95.2) are replaced by the Poisson
equations, we have the Dirichlet and Neumann problems in Poisson's
equation.


24 See O. D. Kellogg, op. cit., p. 311. 


Formula 94.7 is not directly applicable to the solution of problems 95.1
and 95.2, since it requires the knowledge of the values of the function and
of its normal derivative on $\Sigma$. We show next how this difficulty can be
avoided by introducing special functions that depend only on the shape of
the region and not on the assigned boundary values f(P) and g(P). We
begin with the Dirichlet problem.

Let P(x) and $P'(\xi)$ be a fixed point and a variable point, respectively,
in $\tau$ (Fig. 45). We construct a function $G(P, P')$ with the following
properties:

(a) $G(P, P') = \dfrac{1}{r} + w(P')$,

where $r = PP'$ and $w(P')$ is harmonic in $\tau$.

(b) $G(P, P') = 0$ on $\Sigma$.

The condition (b) requires that

$$w(P') = -\frac{1}{r} \quad\text{on } \Sigma,$$

so that $w(P')$, and hence $G(P, P')$, is uniquely determined by properties
(a) and (b). We call $G(P, P')$ Green's function for the region $\tau$.

We show next how Green's function can be used to construct an
explicit integral formula solving the Dirichlet problem in Poisson's
equation

$$\nabla^2 V = -4\pi\rho \ \text{ in } \tau, \qquad V = f(P) \ \text{ on } \Sigma. \tag{95.4}$$

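Fig. 45

The integral formula will include the solution of the boundary value
problem 95.1 as a special case.
Green's symmetrical formula,

$$\int_\tau (u\,\nabla^2 v - v\,\nabla^2 u)\,d\tau = \int_\Sigma \left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right) d\sigma, \tag*{[92.6]}$$

cannot be applied to $u = G(P, P')$, $v = V$, since $G(P, P') \to \infty$ as $P' \to P$.
If, however, we delete the point P by enclosing it in a sphere $\sigma$ of radius $\delta$,
as in Fig. 44, the formula 92.6 is valid in the region $\tau - \epsilon$ bounded by $\Sigma$
and $\sigma$. We can write

$$\int_{\tau-\epsilon} (G\,\nabla^2 V - V\,\nabla^2 G)\,d\tau
= \int_\Sigma \left(G\,\frac{\partial V}{\partial n} - V\,\frac{\partial G}{\partial n}\right) d\sigma
+ \int_\sigma \left(G\,\frac{\partial V}{\partial n} - V\,\frac{\partial G}{\partial n}\right) d\sigma. \tag{95.5}$$

But in $\tau - \epsilon$, $G = 1/r + w$ is harmonic, so that $\nabla^2 G = 0$ and G = 0 on
$\Sigma$. Also $\nabla^2 V = -4\pi\rho$ by (95.4), so that (95.5) reduces to

$$-\int_{\tau-\epsilon} 4\pi G\rho\,d\tau
= -\int_\Sigma V\,\frac{\partial G}{\partial n}\,d\sigma
- \int_\sigma \left(\frac{1}{r} + w\right)\frac{\partial V}{\partial r}\,d\sigma
+ \int_\sigma V\left(\frac{\partial w}{\partial r} - \frac{1}{r^2}\right) d\sigma. \tag{95.6}$$

In writing (95.6) we observed that, since n is an exterior normal, $\partial V/\partial n =
-\partial V/\partial r$ and $\partial G/\partial n = -\partial G/\partial r$ on $\sigma$.

Since $\partial V/\partial r$ and w are continuous on $\sigma$ and $d\sigma = r^2\,d\omega$, where $d\omega$ is
an element of solid angle, it is obvious that the second integral on the
right in (95.6) tends to zero as $\delta \to 0$. Similarly,

$$\int_\sigma V\,\frac{\partial w}{\partial r}\,d\sigma \to 0 \quad\text{as } \delta \to 0,$$

whereas

$$-\int_\sigma V\,\frac{1}{r^2}\,r^2\,d\omega \to -4\pi V(P) \quad\text{as } \delta \to 0.$$

Accordingly, on letting $\delta \to 0$ in (95.6), we get

$$4\pi V(P) = \int_\tau 4\pi G\rho\,d\tau - \int_\Sigma V\,\frac{\partial G}{\partial n}\,d\sigma, \tag{95.7}$$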


which is the desired solution of the problem 95.4. If we set $\rho = 0$, we get
the solution of the corresponding problem in Laplace's equation,

$$V(P) = -\frac{1}{4\pi}\int_\Sigma V\,\frac{\partial G}{\partial n}\,d\sigma. \tag{95.8}$$

To apply this formula we must first obtain Green's function G for the
region $\tau$, that is, we must solve the special Dirichlet problem

$$\nabla^2 w = 0 \ \text{ in } \tau, \qquad w = -\frac{1}{r} \ \text{ on } \Sigma.$$

Similar considerations are applicable to the Neumann problem 95.2.
We introduce the Neumann function

$$N(P, P') = \frac{1}{r} + w(P'),$$

where $w(P')$ is harmonic in $\tau$ and satisfies on the boundary $\Sigma$ of $\tau$ the
condition²⁵

$$\frac{\partial w}{\partial n} = -\frac{\partial}{\partial n}\left(\frac{1}{r}\right) + \text{constant}.$$

Computations entirely similar to those carried out previously for the
Dirichlet problem yield for the boundary value problem 95.2 the formula

$$V(P) = \frac{1}{4\pi}\int_\Sigma g\,N\,d\sigma.$$

Physically, Green's function $G(P, P')$ can be interpreted as the electrostatic
potential in the interior of a grounded conducting surface $\Sigma$ produced
by a unit charge at the point P. The potential produced by a unit
charge alone is 1/r, and $w(P')$ represents the potential produced by the
induced surface charges on $\Sigma$. Since $\Sigma$ is grounded, $G(P, P') = 1/r +
w(P') = 0$ on $\Sigma$. The Neumann function can be interpreted as steady heat
flow from a source of strength $4\pi$ placed at P, when the heat flows across
the surface $\Sigma$ at a uniform rate.


96. Green’s Functions for Semi-infinite Space and Spherical Regions 


A physical interpretation of Green’s function given in Sec. 95 enables us 
to construct Green’s functions for the half-space z > 0 and for the regions 
interior and exterior to the sphere. 


25 To make w unique we can normalize it by requiring $\int_\Sigma w\,d\sigma = 0$.

Fig. 46

When a positive unit charge is placed at P(x, y, z) (Fig. 46) and a negative
unit charge at the mirror image Q(x, y, -z), the electrostatic potential G
produced by these charges at $P'(\xi, \eta, \zeta)$ is

$$G = \frac{1}{r} - \frac{1}{r'}, \tag{96.1}$$

where $r = PP' = \sqrt{(\xi - x)^2 + (\eta - y)^2 + (\zeta - z)^2}$
and
$r' = QP' = \sqrt{(\xi - x)^2 + (\eta - y)^2 + (\zeta + z)^2}$.

Obviously, G = 0 on the plane z = 0, and since $w(P') = -1/r'$ is harmonic
for z > 0, equation 96.1 gives the desired Green's function for the region
z > 0.

On the plane z = 0,

$$\frac{\partial G}{\partial n} = -\left.\frac{\partial G}{\partial\zeta}\right|_{\zeta=0} = -\left.\frac{2z}{r^3}\right|_{\zeta=0},$$

and after performing simple calculations and substituting in formula 95.8,
we obtain the solution of the Dirichlet problem 95.1 for the region z > 0
in the form of Poisson's integral

$$V(x, y, z) = \frac{z}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{f(\xi, \eta)\,d\xi\,d\eta}{[(\xi - x)^2 + (\eta - y)^2 + z^2]^{3/2}}. \tag{96.2}$$


The specified values $f(\xi, \eta)$ of V on z = 0 must, clearly, be such that (96.2)
has a meaning.
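
Poisson's integral for the half-space can be evaluated numerically once the infinite plane is mapped to a finite range. The sketch below takes the integral in the form $V(x, y, z) = \dfrac{z}{2\pi}\iint f(\xi, \eta)\,r^{-3}\,d\xi\,d\eta$ with $r^2 = (\xi - x)^2 + (\eta - y)^2 + z^2$, introduces polar coordinates $(\varrho, \alpha)$ about (x, y), and substitutes $\varrho = z\tan t$, which turns $z\,\varrho\,d\varrho/r^3$ into $\sin t\,dt$. The substitution, the midpoint quadrature, the test data, and the function names are choices made for this illustration, not the book's procedure.

```python
import math


def poisson_half_space(f, x, y, z, n_t=200, n_alpha=64):
    """Midpoint-rule evaluation of Poisson's integral for the region z > 0.

    After the substitution rho = z * tan(t) the integral becomes
    (1 / 2 pi) * double integral of f(x + rho cos a, y + rho sin a) sin(t) dt da
    over 0 < t < pi/2, 0 < a < 2 pi.
    """
    h_t = (math.pi / 2.0) / n_t
    h_a = 2.0 * math.pi / n_alpha
    total = 0.0
    for i in range(n_t):
        t = (i + 0.5) * h_t
        rho = z * math.tan(t)
        weight = math.sin(t)
        for j in range(n_alpha):
            a = (j + 0.5) * h_a
            total += f(x + rho * math.cos(a), y + rho * math.sin(a)) * weight
    return total * h_t * h_a / (2.0 * math.pi)


def boundary_values(xi, eta, depth=1.0):
    """Values on z = 0 of the potential of a unit source at (0, 0, -depth)."""
    return 1.0 / math.sqrt(xi * xi + eta * eta + depth * depth)


def exact_potential(x, y, z, depth=1.0):
    """The same potential evaluated in z > 0, where it is harmonic."""
    return 1.0 / math.sqrt(x * x + y * y + (z + depth) ** 2)
```

Since the potential of a source lying below the plane is harmonic and regular in z > 0, `poisson_half_space(boundary_values, 0.5, -0.3, 0.8)` should reproduce `exact_potential(0.5, -0.3, 0.8)`, and constant boundary data should return the same constant.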

A similar procedure enables us to construct Green's function for
the spherical region $x^2 + y^2 + z^2 < R^2$ and obtain the solution of the
Dirichlet problem 95.1 for the sphere.

We take P(x, y, z) in the interior of the sphere S of radius R (Fig. 47)
and construct the image Q of P with respect to S, so that $OP\cdot OQ = R^2$.
Let $P'$ be a variable point in S. When $P'$ is at a point $P''$ on S, similar triangles
$OPP''$ and $OP''Q$ yield the relation

$$\frac{r'}{r} = \frac{QP''}{P''P} = \frac{OP''}{OP} = \frac{R}{\rho},$$

where $\rho = OP$. Thus

$$\frac{1}{r} - \frac{R}{\rho}\,\frac{1}{r'} = 0 \quad\text{on } S.$$

If, for any interior point $P'$, we define

$$G(P, P') = \frac{1}{r} - \frac{R}{\rho}\,\frac{1}{r'}, \tag{96.3}$$

then (96.3) gives the desired Green's function, since $w = -(R/\rho)(1/r')$ is
harmonic in the interior of S and $G(P, P') = 0$ on S.
A simple calculation of $\partial G/\partial n$ from (96.3) gives

$$\frac{\partial G}{\partial n} = -\frac{R^2 - \rho^2}{R\,r^3},$$

Fig. 47

and formula 95.8 yields the solution of the Dirichlet problem 95.1 for the
sphere in the form of the Poisson integral

$$V(P) = \frac{1}{4\pi}\int_S f(P')\,\frac{R^2 - \rho^2}{R\,r^3}\,d\sigma. \tag{96.4}$$

This integral is usually written in spherical coordinates $(\rho, \theta, \varphi)$ as

$$V(\rho, \theta, \varphi) = \frac{R}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} \frac{f(\theta', \varphi')\,(R^2 - \rho^2)\sin\theta'\,d\theta'\,d\varphi'}{(R^2 - 2\rho R\cos\gamma + \rho^2)^{3/2}}, \tag{96.5}$$

where $\cos\gamma = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\varphi' - \varphi)$, $\theta$ being the colatitude
and $\varphi$ the longitude of P.

Green's function for the region $x^2 + y^2 + z^2 > R^2$ is obtained from
(96.3) by interchanging the roles of P and Q.
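
The Poisson integral for the sphere, written in spherical coordinates with kernel $(R^2 - \rho^2)/(R^2 - 2\rho R\cos\gamma + \rho^2)^{3/2}$ and prefactor $R/4\pi$, can be tested with a midpoint rule in $\theta'$ and $\varphi'$. The boundary data below are the surface values of the harmonic polynomial $x^2 - y^2 + z$ on the unit sphere; this choice, the grid sizes, and the function names are assumptions made for the illustration.

```python
import math


def poisson_sphere(f, rho, theta, phi, R=1.0, n_theta=120, n_phi=120):
    """Midpoint-rule evaluation of the Poisson integral for the sphere.

    f(theta_p, phi_p) gives the boundary values; (rho, theta, phi) are the
    spherical coordinates of the interior point P, with rho < R.
    """
    h_t = math.pi / n_theta
    h_p = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        tp = (i + 0.5) * h_t
        for j in range(n_phi):
            pp = (j + 0.5) * h_p
            cos_gamma = (math.cos(theta) * math.cos(tp)
                         + math.sin(theta) * math.sin(tp) * math.cos(pp - phi))
            denom = (R * R - 2.0 * rho * R * cos_gamma + rho * rho) ** 1.5
            total += f(tp, pp) * (R * R - rho * rho) * math.sin(tp) / denom
    return R / (4.0 * math.pi) * total * h_t * h_p


def boundary_values(theta_p, phi_p):
    """x^2 - y^2 + z restricted to the unit sphere."""
    s = math.sin(theta_p)
    return s * s * math.cos(2.0 * phi_p) + math.cos(theta_p)


def exact_value(rho, theta, phi):
    """The harmonic polynomial x^2 - y^2 + z at an interior point."""
    x = rho * math.sin(theta) * math.cos(phi)
    y = rho * math.sin(theta) * math.sin(phi)
    z = rho * math.cos(theta)
    return x * x - y * y + z
```

With `f` identically 1 the routine should return a value close to 1 for every interior point, and with `boundary_values` it should reproduce `exact_value` at the same point. The quadrature loses accuracy as $\rho$ approaches R, where the kernel becomes sharply peaked.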


Problems

1. Show by using (96.4) that for every position of P in the interior of the sphere

$$\frac{1}{4\pi}\int_S \frac{R^2 - \rho^2}{R\,r^3}\,d\sigma = 1.$$

Hint: Take the z-axis along OP, so that $r^2 = R^2 - 2R\rho\cos\theta + \rho^2$. For a fixed
position of P, $\rho$ is fixed. Express $d\sigma = R^2\sin\theta\,d\theta\,d\varphi$ in spherical coordinates
and evaluate the integral.

2. Show that the solution of the exterior problem 95.1 for the sphere S is
given by

$$V(P) = \frac{1}{4\pi}\int_S f(P')\,\frac{\rho^2 - R^2}{R\,r^3}\,d\sigma,$$

where $\rho = OP > R$.

3. Deduce (96.5) from (96.4). Hint: Let $\gamma$ be the angle between OP and $OP'$
when $P'$ is on S.


97. The Problem of Two Bodies 


The problem of two bodies can be stated as follows: Given a system of 
two particles interacting in accordance with the law of universal gravitation, 
what is the trajectory of the system? This problem was solved by Newton 
in the Principia, Book I, Sec. II. It lies at the basis of all considerations in 
astronomy. 

Since there is no particular advantage in using general curvilinear coordinates
in specific problems, we refer our system to a set of orthogonal
cartesian axes. We denote the coordinates of mass points $m_1$, $m_2$ (at any
given instant of time t) by $(x_1^1, x_1^2, x_1^3)$ and $(x_2^1, x_2^2, x_2^3)$ (Fig. 48). We
also introduce another cartesian reference frame Y moving with the mass
$m_1$ in such a way that $m_1$ is always at the origin O of the Y-system, and the
axes $Y^i$ always remain parallel to the axes $X^i$. The coordinates of the mass
point $m_2$ relative to the Y-axes are denoted by $y^i$, and we have the
relations

$$y^i = x_2^i - x_1^i, \qquad (i = 1, 2, 3). \tag{97.1}$$

We choose the coordinates $y^i$ of the mass $m_2$ as three of our generalized
coordinates, and for the remaining three generalized coordinates we take
those of the center of mass of the system. Thus

$$u^i = \frac{m_1x_1^i + m_2x_2^i}{m_1 + m_2}. \tag{97.2}$$

Clearly, the $u^i$ lie on the line joining the points $(x_1^i)$ and $(x_2^i)$, and our
choice of the generalized coordinates is then as follows:

$$q^1 = y^1,\quad q^2 = y^2,\quad q^3 = y^3,\quad q^4 = u^1,\quad q^5 = u^2,\quad q^6 = u^3.$$

If we solve equations 97.1 and 97.2 for the $x_1^i$ and $x_2^i$, we obtain

$$x_1^i = u^i - \frac{m_2}{m_1 + m_2}\,y^i, \qquad
x_2^i = u^i + \frac{m_1}{m_1 + m_2}\,y^i, \tag{97.3}$$

and these equations enable us to determine the positional coordinates $x^i$
in terms of the generalized coordinates $q^i$.

Fig. 48
This particular choice of generalized coordinates is made with a view
toward obtaining a simple expression for the potential energy V of our
system of particles. Indeed, since the magnitude of the force of attraction
F is given by $F = m_1m_2/r^2$, where r is the distance between the particles,
the potential energy V is

$$V = -\frac{m_1m_2}{r} = -\frac{m_1m_2}{[(x_1^1 - x_2^1)^2 + (x_1^2 - x_2^2)^2 + (x_1^3 - x_2^3)^2]^{1/2}},$$

and it follows from (97.1) that the coordinates $u^i$ do not appear in V, so
that V is a function of $y^1$, $y^2$, and $y^3$.
We recall the Lagrangean equations

$$\frac{d}{dt}\left(\frac{\partial T}{\partial\dot{q}^i}\right) - \frac{\partial T}{\partial q^i} = -\frac{\partial V}{\partial q^i} \tag*{[86.11]}$$

and compute

$$T = \tfrac{1}{2}m_1\dot{x}_1^i\dot{x}_1^i + \tfrac{1}{2}m_2\dot{x}_2^i\dot{x}_2^i
= \frac{1}{2}(m_1 + m_2)\,\dot{u}^i\dot{u}^i + \frac{1}{2}\,\frac{m_1m_2}{m_1 + m_2}\,\dot{y}^i\dot{y}^i.$$

Since $\partial V/\partial q^i = 0$ for i = 4, 5, 6, an easy calculation makes equations
86.11 reduce to

$$\frac{m_1m_2}{m_1 + m_2}\,\ddot{y}^i = -\frac{\partial V}{\partial y^i}, \quad (i = 1, 2, 3), \qquad
\ddot{u}^i = 0, \quad (i = 1, 2, 3). \tag{97.4}$$

Equations 97.4 are the differential equations characterizing the motion of
our system. We note first that the motion of the mass $m_2$ relative to $m_1$ is
the same as though the mass $m_1$ were fixed and $m_2$ attracted toward it
with a force whose potential is $[(m_1 + m_2)/m_1]V$. This follows at once
from the first three of equations 97.4 if we rewrite them in the form

$$m_2\ddot{y}^i = -\frac{m_1 + m_2}{m_1}\,\frac{\partial V}{\partial y^i}. \tag{97.5}$$

Thus our problem is reduced to a study of motion under the action of
central forces. The second set of equations 97.4 states that the center of
mass moves in a straight line with constant velocity.

We shall carry out the integration of equations 97.4 under the assumption
that $m_1$ (the mass of the sun) is much larger than $m_2$ (the mass of the
earth). If $m_1 \gg m_2$, the center of mass $u^i$ will lie very close to the mass $m_1$,
and hence the coordinates $u^i$ will nearly coincide with those of the mass
$m_1$. Thus $x_1^i \approx u^i$, and from the second set of equations 97.4 we conclude
that $m_1$ moves through space with constant velocity. Accordingly, we need
to examine only the motion of mass $m_2$ relative to $m_1$.

Fig. 49

If $m_1 \gg m_2$,

$\dfrac{m_1 + m_2}{m_1} \approx 1$,

and equations 97.5 become

$m_2\,\ddot y^i = -\dfrac{\partial V}{\partial y^i}$ (approximately).


Let us suppose that our coordinate axes are so oriented that the motion
of the mass $m_2$ relative to $m_1$ initially is in the $Y^1Y^2$-plane. Then, since the
force field is central, the motion will remain in this plane, for there is no
component of force at right angles to the plane. Let $r$ and $\theta$ (Fig. 49) be
the polar coordinates of mass $m_2$, where

$y^1 = r\cos\theta$, $\quad y^2 = r\sin\theta$;

then the kinetic energy of mass $m_2$ is

$T = \tfrac{1}{2} m_2[(\dot y^1)^2 + (\dot y^2)^2] = \tfrac{1}{2} m_2(\dot r^2 + r^2\dot\theta^2)$.

Using this expression for $T$, and $V = -m_1 m_2/r$, in the Lagrangean
equations 86.11, with $q^1 = r$ and $q^2 = \theta$, we get²⁶

$m_2\ddot r - m_2 r\dot\theta^2 = -\dfrac{m_1 m_2}{r^2}$,

$\dfrac{d}{dt}(m_2 r^2\dot\theta) = 0$,

26 We consider the force directed from $m_2$ to $m_1$.




or

(97.6) $\ddot r - r\dot\theta^2 = -\dfrac{m_1}{r^2}$, $\quad r^2\dot\theta = h$,

where $h$ is a constant of integration.

Equations 97.6 are simultaneous ordinary differential equations for the
determination of the trajectory. The second of these states that the
sectorial velocities are constant. This is one of the Kepler laws.²⁷ We can
use the relation $r^2\dot\theta = h$ to determine the time required to describe the
orbit.

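If $h \neq 0$, so that the trajectory is not a straight line, we can eliminate the
time parameter $t$ by noting that $r^2\,d\theta = h\,dt$, or

$t = \dfrac{1}{h}\displaystyle\int_0^{\theta} r^2\,d\theta$.

Since $df/dt = (df/d\theta)(d\theta/dt)$, we have the relation $d/dt = (h/r^2)(d/d\theta)$, and,
making use of this in the first equation in (97.6), we get

$\dfrac{h}{r^2}\dfrac{d}{d\theta}\left(\dfrac{h}{r^2}\dfrac{dr}{d\theta}\right) - \dfrac{h^2}{r^3} = -\dfrac{m_1}{r^2}$,

or, multiplying by $r^2$,

(97.7) $h\,\dfrac{d}{d\theta}\left(\dfrac{h}{r^2}\dfrac{dr}{d\theta}\right) - \dfrac{h^2}{r} + m_1 = 0$.

If we further change the dependent variable $r$ in (97.7) by setting $u = 1/r$,
we get a simple second-order linear equation

$\dfrac{d^2u}{d\theta^2} + u = \dfrac{m_1}{h^2}$,

whose solution is

$u = \dfrac{m_1}{h^2}\,[1 - e\cos(\theta - \alpha)]$,

or

(97.8) $r = \dfrac{l}{1 - e\cos(\theta - \alpha)}$,

where $l = h^2/m_1$, and $\alpha$ and $e$ are constants of integration.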
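
The claim that the conic 97.8 solves the orbit equation can be checked numerically. The sketch below evaluates $u = 1/r$ on the conic and estimates $d^2u/d\theta^2 + u - m_1/h^2$ by a central difference; the function names and the sample values of $m_1$, $h$, $e$, $\alpha$ are illustrative assumptions, not taken from the text.

```python
import math


def orbit_radius(theta, m1, h, e, alpha):
    """Radius on the conic (97.8), r = l / (1 - e cos(theta - alpha)), l = h^2/m1."""
    l = h * h / m1
    return l / (1.0 - e * math.cos(theta - alpha))


def orbit_equation_residual(theta, m1, h, e, alpha, step=1e-4):
    """Central-difference estimate of u'' + u - m1/h^2 with u = 1/r."""
    def u(angle):
        return 1.0 / orbit_radius(angle, m1, h, e, alpha)

    second = (u(theta + step) - 2.0 * u(theta) + u(theta - step)) / step**2
    return second + u(theta) - m1 / h**2


if __name__ == "__main__":
    # Hypothetical constants of integration, chosen only for illustration.
    for theta in (0.0, 1.0, 2.5):
        print(theta, orbit_equation_residual(theta, m1=1.0, h=1.2, e=0.5, alpha=0.3))
```

The printed residuals should be small compared with $m_1/h^2$, limited only by the finite-difference step.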


27 See also an illustrative example at the end of Sec. 90.


Fig. 50


We thus see that the orbit is a conic section (Fig. 50) whose eccentricity
is $e$, with the position of the apse line determined by $\alpha$. The constant $\alpha$ is
known as the perihelion constant. We shall not go to the trouble of
determining these constants²⁸ in terms of the initial position and velocity
of mass $m_2$, since the main object of this section is to obtain formula 97.8
for the purpose of comparing it with the corresponding equation of the
orbit in the relativistic dynamics.


28 See P. Appell, Mécanique rationnelle, vol. 1, Chapter 11, and J. L. Synge and
B. A. Griffith, Principles of Mechanics (1959), pp. 160-169.


5


RELATIVISTIC MECHANICS 


98. Invariance of Physical Laws 


The formulation of the fundamental laws of classical mechanics in the
preceding chapter is based on the hypothesis that physical phenomena
take place in a three-dimensional Euclidean space. It is also assumed that
these phenomena can be ordered in the one-dimensional continuum of
the time variable $t$. The time variable $t$ is regarded as independent
not only of the space variables $x^i$ but also of the possible motion of the
space reference systems. The mass $m$ of a body is likewise supposed to be
independent of the motion of reference systems, and, in particular, it is
invariant with respect to a group of Galilean transformations of coordinates.
By a Galilean transformation we mean a transformation that represents
a translation of one coordinate system relative to another with constant
velocity. Thus, if $Y$ is a given cartesian frame, then a Galilean
transformation of this frame has the form

(98.1) $\bar y^i = y^i + u^i t$, $(i = 1, 2, 3)$,

where $u^i$ is a constant vector representing the velocity of the origin of
the $Y$-system relative to the cartesian system $\bar Y$. It is supposed in (98.1)
that the origins of the systems $Y$ and $\bar Y$ coincide at the time $t = 0$.

From the linear character of (98.1), it is obvious that the accelerations
$d^2y^i/dt^2$ and $d^2\bar y^i/dt^2$ of a particle referred to the frames $Y$ and $\bar Y$,
respectively, have the same value. It follows from this that the force $F$ acting
on a particle has the same value $F = ma$ in all reference systems moving
relative to one another with constant velocity. In other words, Newton’s
second law of motion is formally invariant relative to a group of Galilean
transformations 98.1.

Although the values of accelerations $a^i$ are the same in all inertial
systems,¹ the estimates of velocities differ in accordance with the formula

(98.2) $\bar v^i = v^i + u^i$.
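
The invariance of accelerations under 98.1 can be illustrated with a finite-difference check. The sketch below assumes the reading of 98.1 in which the transformed coordinates equal the original ones plus $u^i t$; the sample trajectory, the velocity $u^i$, and all function names are hypothetical and serve only as an illustration.

```python
def galilean(y, t, u):
    """Apply (98.1): transformed coordinate = y^i + u^i * t."""
    return [yi + ui * t for yi, ui in zip(y, u)]


def acceleration(path, t, dt=1e-3):
    """Central second difference of a path t -> [y^1, y^2, y^3]."""
    before, here, after = path(t - dt), path(t), path(t + dt)
    return [(a - 2.0 * h + b) / dt**2 for b, h, a in zip(before, here, after)]


def sample_path(t):
    # Hypothetical trajectory in the original frame, for illustration only.
    return [t * t, 2.0 * t, 0.5 * t**3]


def sample_path_transformed(t, u=(3.0, -1.0, 2.0)):
    # The same trajectory as seen from the second frame.
    return galilean(sample_path(t), t, u)


if __name__ == "__main__":
    print(acceleration(sample_path, 1.0))
    print(acceleration(sample_path_transformed, 1.0))
```

Both printed triples should agree to rounding error, since the term $u^i t$ is linear in $t$ and contributes nothing to the second derivative.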


1 See Sec. 76. 


Hence a statement of any law that depends on the velocity relative to a 
primary inertial system will not be formally invariant when expressed 
in a secondary system. Consequently, the fundamental laws of electro- 
dynamics and, particularly, of optics are not invariant with respect to a 
group of Galilean transformations 98.1, since these laws depend on the 
velocity of propagation of light. For this reason the primary inertial 
system has occupied a unique position in the theory of optics. In order 
to explain the observed fact of the independence of the velocity of light 
from the velocity of its source, and to imbed optics in the framework of 
analytical mechanics, physicists invented ether as a hypothetical carrier 
of light waves. This carrier was endowed with whatever physical properties 
were essential to ensure the same constant value for the velocity of propa- 
gation of light in all inertial systems, even when these properties did 
great violence to the established theories of elasticity and hydrodynamics. 
For instance, it was supposed that ether is an all-pervading, frictionless 
fluid that remains stationary relative to the primary inertial system, and 
that, when physical objects are forced to move through it, they suffer 
changes in shape, produced by elastic stresses that arise in a body moving 
in a quiescent fluid. It was then merely necessary to assume that the 
linear dimensions of measuring instruments suffer contractions depending 
on the velocity $u^i$, these contractions being of precisely the right amount
to make the velocity of light come out to be independent of the velocity 
of its source. 

A suitable formula expressing the dependence of the linear dimensions 
of a body on its velocity relative to a primary inertial system was developed 
by Lorentz, and a considerable body of the theory of relativity was 
phrased by him, in 1904, in terms of the quiescent ether. Lorentz’s 
mathematics appeared to fit well the observed results in the domain of 
electrodynamics and provided a simple explanation of a puzzling behavior 
of the electrical field of a moving spherical charge, but the physics of the
situation still remained in great doubt. However, all experimental 
attempts to detect the existence of ether have led to null results, and, in 
1905, Albert Einstein achieved an explanation of the so-called Lorentz- 
Fitzgerald contraction by a sort of fiat which called for a profound 
revision in the prevailing notions of space and time. 


99. Restricted, or Special, Theory of Relativity 


In 1905, Einstein proposed two postulates, one of which relates to the 
formal invariance of physical laws, and the other epitomizes the results of 
certain remarkable experiments on the determination of the speed of light.²

2 A. Einstein, Annalen der Physik, 18 (1905), p. 891.




These postulates can be stated as follows: 


1. Physical laws and principles have the same form in all Galilean systems; 
that is, reference systems that move relative to one another with uniform 
velocities. 


2. The speed of light in free space has the same constant value in all 
inertial systems. 


In a sense there is nothing startling about these pronouncements since 
the ideas involved were in a state of ferment and discussion at the close 
of the nineteenth century and are quite explicit in the writings of Poincaré, 
Lorentz, Voigt, and others. But deductions to which Einstein was led 
from these postulates served to clarify and revise our concepts of space, 
time, and matter in a truly remarkable way. When viewed in the light of 
the fundamental laws of dynamics of a particle, the first postulate, as 
already remarked in Sec. 98, contains nothing novel. The laws of optics, 
on the other hand, are not invariant under the group of transformations 
98.1, and one can set out to modify them so as to achieve the invariance 
of the fundamental laws of optics as well as mechanics. One way of 
accomplishing this is to abrogate the hypothesis that the estimates of
time $t$ are identical for observers located in two different Galilean reference
systems. Mathematically this puts the time variable $t$ on the same footing
with the space variables $y^i$.

Thus, let us suppose that we have two cartesian reference frames $Y$
and $\bar Y$, and an observer in the $Y$-frame recording the occurrence of some
event at the point $(y^i)$ at the time $t$, by means of four variables $(y^1, y^2, y^3, t)$.
The four-dimensional manifold $S_4$ of the variables $(y^1, y^2, y^3, t)$ consists
of $E_3$ and the range $-\infty < t < +\infty$. The same event is recorded by an
observer in the $\bar Y$-frame as a point $(\bar y^1, \bar y^2, \bar y^3, \bar t)$ in $S_4$, where $\bar t$ is the
estimate of time based on the clock in the $\bar Y$-system of coordinates. As
yet the variables $(y^1, y^2, y^3, t)$ and $(\bar y^1, \bar y^2, \bar y^3, \bar t)$ are unrelated, but, since
we are in search of coordinate transformations which preserve the laws
of dynamics of a particle, let the word “event” mean the track of a
particle moving in the $Y$-frame under the action of a zero force. The
trajectory of such a particle in the $Y$-frame is a straight line, and we shall
suppose that the motion of the $\bar Y$-system relative to the $Y$-system is such
that the trajectory in it also appears as a straight line.

This hypothesis implies the invariance of Newton’s first law and
requires that the variables $(y^1, y^2, y^3, t)$ and $(\bar y^1, \bar y^2, \bar y^3, \bar t)$ be related linearly.
Thus

(99.1) $\bar y^i = a^i_j\,y^j + a^i_4\,t$, $\quad \bar t = a^4_j\,y^j + a^4_4\,t$, $(i, j = 1, 2, 3)$.




It follows from these equations that the origin of the system $Y$ moves
relative to the system $\bar Y$ with constant velocity. To see this, note that the
coordinates of the origin $O$ of the system $Y$ are $(0, 0, 0)$, and hence the
trajectory of the origin $O$ relative to $\bar Y$ is given by (99.1) as

$\bar y^i = a^i_4\,t$, $\quad \bar t = a^4_4\,t$.

Hence $d\bar y^i/d\bar t = a^i_4/a^4_4 = \text{constant}$.

It can be shown in a similar way that the coordinate planes move with
constant velocity, so that the reference frames $Y$ and $\bar Y$ are Galilean.

Let us suppose next that a spherical pulse of light is sent out from the
point $P(y^1, y^2, y^3)$ of the system $Y$ at the time $t$. According to Einstein’s
second postulate, light travels with constant speed $c$ in all directions;
hence in $dt$ seconds a photon starting from the point $(y^i)$ will be at the
point $(y^i + dy^i)$, and

(99.2) $dy^i\,dy^i = c^2\,dt^2$.

Relative to an observer located in the $\bar Y$-system, the light pulse originates
at the point $(\bar y^1, \bar y^2, \bar y^3)$, and his equation for the spherical wave front,
$d\bar t$ seconds later, is

(99.3) $d\bar y^i\,d\bar y^i = c^2\,d\bar t^2$.

Now if we substitute in (99.3) from (99.1) and compare the result
with (99.2), we find that a particular set of equations

(99.4) $\bar y^1 = k(y^1 - vt)$, $\quad \bar y^2 = y^2$, $\quad \bar y^3 = y^3$, $\quad \bar t = k\left(t - \dfrac{\beta}{c}\,y^1\right)$,

where $k = 1/\sqrt{1 - \beta^2}$, $\beta = v/c$, leaves the quadratic form³

(99.5) $d\sigma^2 = c^2\,dt^2 - dy^i\,dy^i$

invariant. These equations correspond to the circumstance when the
system $\bar Y$ moves relative to $Y$ with the velocity $v$ along the $Y^1$-axis.
Equations 99.4 are known as the Lorentz-Einstein equations of
transformation.⁴ We shall not launch into an extensive discussion of their
implications, since most books on theoretical physics and the special theory
of relativity discuss them at great length, and there is no need to duplicate
these considerations here. We shall mention only one example which has
a direct bearing on the Lorentz-Fitzgerald contraction mentioned in
Sec. 98.

3 We note that for the pulse of light $d\sigma = 0$.

4 These equations have been derived in many different ways. See, for example,
J. Rice, Relativity, p. 89; R. Tolman, Theory of Relativity of Motion; A. Einstein,
Annalen der Physik, 18 (1905); Frank, Ignatowsky, and Rothe, Archiv für Mathematik
und Physik, 17 and 18; J. L. Synge, Relativity: The Special Theory (1959), p. 69.

Consider a rod moving with the system $\bar Y$. The end points of the rod
have the coordinates $(\bar y_1^1, 0, 0)$, $(\bar y_2^1, 0, 0)$, so that its length, as measured
by an observer in the $\bar Y$-system, is $\bar L = \bar y_2^1 - \bar y_1^1$. Since $\bar y_2^1 = k(y_2^1 - vt)$
and $\bar y_1^1 = k(y_1^1 - vt)$,

$L = y_2^1 - y_1^1 = \sqrt{1 - \beta^2}\,(\bar y_2^1 - \bar y_1^1)$.

Accordingly, the estimate of the length $L$ of the rod by an observer in the
$Y$-system is smaller than $\bar L$ in the ratio $\sqrt{1 - \beta^2} : 1$. Thus the observer in
the $Y$-system concludes that moving objects suffer a contraction in length.
The magnitude of this contraction is the same as that deduced by Lorentz
and Fitzgerald in connection with their study of the electrical field of a
moving spherical charge. Whereas Lorentz and Fitzgerald thought of
their contraction as a “real contraction” produced by the passage of
objects through a quiescent ether, in the foregoing calculation it appears
as a property of the space-time manifold subjected to a transformation
99.4, in which the space variables $y^i$ are such that an element of arc $ds$
is given by the formula $ds^2 = dy^i\,dy^i$.
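
The invariance of the form 99.5 and the contraction factor can both be checked numerically. The sketch below assumes that equations 99.4 are completed by the usual time equation, in which the transformed time equals $k(t - \beta y^1/c)$; the function names and the sample events are illustrative assumptions, and only the $y^1$ and $t$ coordinates are carried since the other two are unchanged.

```python
import math


def lorentz_boost(y1, t, v, c=1.0):
    """Transform (y^1, t) by 99.4, assuming the standard time equation."""
    beta = v / c
    k = 1.0 / math.sqrt(1.0 - beta * beta)
    return k * (y1 - v * t), k * (t - beta * y1 / c)


def interval_squared(dy1, dt, c=1.0):
    """The form 99.5 restricted to displacements along the first axis."""
    return c * c * dt * dt - dy1 * dy1


def contracted_length(rest_length, v, c=1.0):
    """Length assigned to a moving rod: rest length times sqrt(1 - beta^2)."""
    return rest_length * math.sqrt(1.0 - (v / c) ** 2)


if __name__ == "__main__":
    v = 0.6  # hypothetical relative speed, in units with c = 1
    first, second = (0.0, 0.0), (2.0, 5.0)  # two sample events (y^1, t)
    a, b = lorentz_boost(*first, v), lorentz_boost(*second, v)
    print(interval_squared(second[0] - first[0], second[1] - first[1]))
    print(interval_squared(b[0] - a[0], b[1] - a[1]))
    print(contracted_length(1.0, v))
```

The two interval values should agree to rounding error, and the last line gives the factor $\sqrt{1 - \beta^2}$ for the chosen speed.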

If instead of cartesian variables $y^i$ we had chosen curvilinear coordinates
$x^i$, related to cartesian coordinates $y^i$ by the formulas

$y^i = y^i(x^1, x^2, x^3)$,

then the form 99.5 would have read

$d\sigma^2 = c^2\,dt^2 - g_{ij}\,dx^i\,dx^j$, $\quad g_{ij} = \dfrac{\partial y^k}{\partial x^i}\dfrac{\partial y^k}{\partial x^j}$.

We note that the determinant of coefficients of this form has the value
$-c^2 g$.

The foregoing formulas can be cast in a symmetric form by setting
$t = x^4$; then

(99.6) $d\sigma^2 = a_{\alpha\beta}\,dx^\alpha\,dx^\beta$, $(\alpha, \beta = 1, 2, 3, 4)$,

where

$a_{ij} = -g_{ij}$, $(i, j = 1, 2, 3)$, $\quad a_{i4} = 0$, $\quad a_{44} = c^2$, and $a = |a_{\alpha\beta}| = -c^2 g$.

If we now introduce a class of admissible functional transformations $T$
in the four-dimensional manifold $X$,

(99.7) $T$: $\bar x^\alpha = \bar x^\alpha(x^1, x^2, x^3, x^4)$, $(\alpha = 1, 2, 3, 4)$,




and require that the form 99.6 be invariant under the class of transforma- 
tions 99.7, we can formulate the calculus of tensors as we did in Chapter 2. 


Problems 


1. Show, with the aid of equations 99.4, that events that are simultaneous
from the point of view of an observer in the $Y$-system are not in general
simultaneous in the $\bar Y$-system.

2. Discuss the slowing down of moving clocks.

3. Differentiate equations 99.4, and establish the relations between the
components of velocity $\bar w^i$ of a moving point, as measured by an observer in the
$\bar Y$-system, with the corresponding quantities $w^i$ measured in the $Y$-system.

Ans. $\dfrac{dy^1}{dt} = \dfrac{\bar w^1 + v}{1 + (\beta/c)\bar w^1}$, $\quad \dfrac{dy^2}{dt} = \dfrac{\bar w^2}{k[1 + (\beta/c)\bar w^1]}$.

4. With the aid of the formulas given in Problem 3, show that, if $\bar w$ and $v$ are
both less than $c$, then $w/c < 1$. Thus, if $v = 0.9c$, $\bar w = 0.9c$, then $w = 0.994c$
instead of $1.8c$ given by the usual law of composition of velocities.

5. The expression $\operatorname{arctanh}(w/c)$ is sometimes called the rapidity. Show that
the usual law of composition of velocities is obeyed by the rapidities. Thus

$\operatorname{arctanh}\dfrac{w}{c} = \operatorname{arctanh}\dfrac{\bar w}{c} + \operatorname{arctanh}\dfrac{v}{c}$.
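
The numbers quoted in Problem 4 and the additivity stated in Problem 5 can be confirmed numerically (this is a check, not a proof). The sketch assumes collinear velocities and the composition law of Problem 3 written as $(\bar w + v)/(1 + \bar w v/c^2)$; the function names `compose` and `rapidity` are illustrative.

```python
import math


def compose(w_bar, v, c=1.0):
    """Relativistic composition of two collinear velocities."""
    return (w_bar + v) / (1.0 + w_bar * v / (c * c))


def rapidity(w, c=1.0):
    """Rapidity arctanh(w/c) of a velocity w with |w| < c."""
    return math.atanh(w / c)


if __name__ == "__main__":
    w = compose(0.9, 0.9)  # both velocities equal to 0.9c, in units with c = 1
    print(round(w, 3))
    print(rapidity(w), rapidity(0.9) + rapidity(0.9))
```

The first line reproduces the value 0.994c of Problem 4, and the two rapidities printed on the second line should agree to rounding error.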


100. Proper or Local Coordinates 


Consider a point $P$ whose space coordinates relative to some reference
frame $X$ are $(x^1, x^2, x^3)$. Let the velocity of $P$, relative to this frame at
the instant $t$, be $\mathbf{v}$. We shall introduce a Galilean reference frame $\bar X$
moving with the point $P$ so that, at each instant $t$, the point $P$ is at rest
relative to the system $\bar X$. We shall call the system $\bar X$ a local or proper
coordinate system.

Obviously the choice of local coordinate systems is not unique, since
the definition just laid down merely requires that the velocity of the
local frame be the same as that of the particle. This implies that the
estimates of time (measured by the clocks carried in two different local
coordinate frames) are the same. Hence the transformation from one
local system $\bar X$ to another $\bar X'$ has the form

$\bar x'^i = \bar x'^i(\bar x^1, \bar x^2, \bar x^3)$, $\quad \bar t' = \bar t$.

The interval $d\sigma$ is defined by the formula

(100.1) $d\sigma^2 = a_{\alpha\beta}\,dx^\alpha\,dx^\beta = c^2\,dt^2 - g_{ij}\,dx^i\,dx^j$,




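so that

(100.2) $\left(\dfrac{d\sigma}{dt}\right)^2 = c^2 - g_{ij}\dfrac{dx^i}{dt}\dfrac{dx^j}{dt} = c^2 - v^2$,

where $v$ is the magnitude of the velocity $\mathbf{v}$ of the point $P$ relative to the
$X$-coordinate frame. If a local coordinate system $\bar X$ is introduced at
$P$, then, relative to $\bar X$, $v = 0$ and equation 100.2 yields

(100.3) $\left(\dfrac{d\sigma}{d\bar t}\right)^2 = c^2$

in the local system. We define the Minkowski velocity vector $u^\alpha$ by the
formula

(100.4) $u^\alpha = \dfrac{dx^\alpha}{d\sigma}$, $(\alpha = 1, 2, 3, 4)$,

and observe that its components in a local system $\bar X$ are⁵ $(0, 0, 0, 1/c)$.

Since $a = |a_{\alpha\beta}| = -c^2 g \neq 0$, we can construct the reciprocal tensor
$a^{\alpha\beta}$, the Christoffel symbols

$[\alpha\beta, \gamma] = \dfrac{1}{2}\left(\dfrac{\partial a_{\alpha\gamma}}{\partial x^\beta} + \dfrac{\partial a_{\beta\gamma}}{\partial x^\alpha} - \dfrac{\partial a_{\alpha\beta}}{\partial x^\gamma}\right)$,
$\quad \left\{ {\delta \atop \alpha\beta} \right\} = a^{\delta\gamma}[\alpha\beta, \gamma]$,

and define the operations of covariant and intrinsic differentiation as
was done in Chapters 2 and 3. This permits us to define the Minkowski
acceleration vector $f^\alpha$ by the formula

(100.5) $f^\alpha = \dfrac{\delta u^\alpha}{\delta\sigma} = \dfrac{d^2x^\alpha}{d\sigma^2} + \left\{ {\alpha \atop \beta\gamma} \right\}\dfrac{dx^\beta}{d\sigma}\dfrac{dx^\gamma}{d\sigma}$.

If our local reference frame $\bar X$ is cartesian so that $d\sigma^2 = c^2\,d\bar t^2 - d\bar y^i\,d\bar y^i$,
the components $\bar f^\alpha$ of the Minkowski acceleration relative to it are

$\bar f^\alpha = \dfrac{d^2\bar x^\alpha}{d\sigma^2} = \dfrac{d}{d\bar t}\left(\dfrac{d\bar x^\alpha}{d\sigma}\right)\dfrac{d\bar t}{d\sigma} = \dfrac{1}{c}\dfrac{d}{d\bar t}\left(\dfrac{d\bar x^\alpha}{d\sigma}\right) = \dfrac{1}{c^2}\dfrac{d^2\bar x^\alpha}{d\bar t^2}$,

so that

$\bar f^i = \dfrac{1}{c^2}\dfrac{d^2\bar y^i}{d\bar t^2}$, $(i = 1, 2, 3)$, $\quad \bar f^4 = 0$, since $\bar x^4 = \bar t$.

5 For $u^i = \dfrac{dx^i}{d\sigma} = \dfrac{dx^i}{dt}\dfrac{dt}{d\sigma} = \dfrac{v^i}{\sqrt{c^2 - v^2}}$ and $u^4 = \dfrac{dx^4}{d\sigma} = \dfrac{1}{\sqrt{c^2 - v^2}}$, and
$v^i = 0$, $x^4 = t$ in a local system.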


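We shall show next how Newton’s second law can be written in an
invariant form relative to all Galilean reference frames. Consider the
formula suggested by Newton’s second law,

$F^\alpha = \dfrac{\delta(m_0 u^\alpha)}{\delta\sigma}$,

where $u^\alpha = dx^\alpha/d\sigma$ is the Minkowski velocity and $m_0$ is a constant whose
significance will appear presently. Now

$F^\alpha = \dfrac{\delta}{\delta t}(m_0 u^\alpha)\,\dfrac{dt}{d\sigma}
= \dfrac{1}{\sqrt{c^2 - v^2}}\,\dfrac{\delta}{\delta t}\left(\dfrac{m_0}{\sqrt{c^2 - v^2}}\dfrac{dx^\alpha}{dt}\right)
= \dfrac{1}{c^2\sqrt{1 - \beta^2}}\,\dfrac{\delta}{\delta t}\left(\dfrac{m_0}{\sqrt{1 - \beta^2}}\dfrac{dx^\alpha}{dt}\right)$,

where we made use of the relation 100.2, and set $\beta = v/c$. If we define

$m = \dfrac{m_0}{\sqrt{1 - \beta^2}}$,

the foregoing equation can be written in the form

(100.6) $F^\alpha = \dfrac{1}{c^2\sqrt{1 - \beta^2}}\,\dfrac{\delta}{\delta t}\left(m\,\dfrac{dx^\alpha}{dt}\right)$,

and since, in the local coordinate system $\bar Y$, $\beta = 0$ and $m = m_0$,

(100.7) $\bar F^\alpha = \dfrac{m_0}{c^2}\,\dfrac{d^2\bar y^\alpha}{d\bar t^2}$.

This has the form of Newton’s second law used in classical mechanics.
We see that the invariant $m_0$ is the mass of the particle $P$ referred to a
local reference frame. It is called the rest (or proper) mass of the particle.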




Since equation 100.7 is a tensor equation, we can write the force equation
as

$F^\alpha = m_0 f^\alpha$,

which is valid in all Galilean reference frames.

We shall rewrite (100.6) in the form

(100.8) $\mathcal{F}^\alpha = \dfrac{\delta}{\delta t}\left(\dfrac{m_0 v^\alpha}{\sqrt{1 - \beta^2}}\right)$,

where $v^\alpha = dx^\alpha/dt$, and $\mathcal{F}^\alpha = c^2\sqrt{1 - \beta^2}\,F^\alpha$, and shall take it as the
equation of motion of a particle in the restricted theory of relativity.


101. Einstein’s Energy Equation 


We conclude our sketch of the rudiments of mechanics in the restricted 
theory of relativity by establishing an important connection between mass 
and energy. 

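For simplicity in writing we suppose that the coordinates $x^i$ used in
this section are rectangular cartesian; and we recall that the work done
by the force $\mathcal{F}^i$, $(i = 1, 2, 3)$, in producing a displacement $dx^i$ is equal to
the change in the kinetic energy. Indeed, the classical theory gives

$T - T_0 = \displaystyle\int_{v_0}^{v} m v\,dv = \int_{v_0}^{v} m\,\dfrac{dx^i}{dt}\,d\left(\dfrac{dx^i}{dt}\right)
= \int_{t_0}^{t} m\,\dfrac{d^2x^i}{dt^2}\dfrac{dx^i}{dt}\,dt = \int_{P_0}^{P} F^i\,dx^i$.

If we take as our definition of the kinetic energy in the restricted theory
of relativity the expression

(101.1) $T = \displaystyle\int_{P_0}^{P} \mathcal{F}^i\,dx^i$,

and insert for the $\mathcal{F}^i$ from equation 100.8, we get⁶

$T = \displaystyle\int_{P_0}^{P} \mathcal{F}^i\,dx^i = \int_{t_0}^{t} \dfrac{d}{dt}\left(\dfrac{m_0 v^i}{\sqrt{1 - \beta^2}}\right)\dfrac{dx^i}{dt}\,dt
= m_0\int_{t_0}^{t}\left[v^i\,\dfrac{d}{dt}\left(\dfrac{1}{\sqrt{1 - \beta^2}}\right)\dfrac{dx^i}{dt} + \dfrac{dv^i}{dt}\,\dfrac{1}{\sqrt{1 - \beta^2}}\,\dfrac{dx^i}{dt}\right]dt$.

6 Since the reference frame is cartesian, the intrinsic derivative reduces to the
ordinary derivative.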


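But $\beta^2 = v^i v^i/c^2$; hence $v^i(dx^i/dt) = \beta^2 c^2$, and $\beta\dot\beta = (v^i/c^2)(dv^i/dt)$.
Substituting these expressions in the integral, we get

$T = m_0 c^2\displaystyle\int_{t_0}^{t}\left[\dfrac{\beta^3\dot\beta}{(1 - \beta^2)^{3/2}} + \dfrac{\beta\dot\beta}{(1 - \beta^2)^{1/2}}\right]dt
= m_0 c^2\int_{t_0}^{t}\dfrac{\beta\dot\beta}{(1 - \beta^2)^{3/2}}\,dt$,

or

$T = \dfrac{m_0 c^2}{(1 - \beta^2)^{1/2}} + \text{constant}$.

If we wish to have $T = 0$ when $\beta = v/c = 0$, the constant of integration
is $-m_0 c^2$, so that

$T = \dfrac{m_0 c^2}{(1 - \beta^2)^{1/2}} - m_0 c^2 = (m - m_0)c^2$.

Thus

(101.2) $m = m_0 + \dfrac{T}{c^2}$.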


We see that the mass $m$ depends on the kinetic energy. If this result is
assumed to hold in dissipative systems, then the decrease in mass $m$ must
be accounted for by the loss of energy by radiation.⁷

7 In vol. 41 (1935) of the Bulletin of the American Mathematical Society, Einstein
gave an elementary derivation of this mass-energy relation by basing his considerations
on the principles of conservation of energy and momentum. For a definitive treatment
see J. L. Synge, Relativity: The Special Theory (1956), Chapter VI.

We see from the foregoing that the principles of conservation of energy
and conservation of mass, which appeared to be quite distinct in the
classical theory, can be united into one law in the restricted theory.
We also see from equation 101.2 that, if a particle takes up an amount
of energy $\Delta T$, then its inertial mass $m$ is increased by an amount $\Delta T/c^2$.
Thus the inertial mass $m$ can be considered a measure of the energy of
the particle, and the law of conservation of mass holds if, and only if,
the particle neither receives nor gives up its energy. Einstein associated
with every mass $m$ an amount of energy $E = mc^2$. Then equation 101.2
can be written in the form

$E = m_0 c^2 + T$,

in which $m_0 c^2$ appears as the intrinsic energy and $T$ as the kinetic energy.
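
A short numerical sketch may help fix the relation between 101.2 and the classical kinetic energy. It assumes the definition of $m$ as the rest mass divided by $\sqrt{1 - \beta^2}$ and the result $T = (m - m_0)c^2$; the function names and the sample values are illustrative, and units with $c = 1$ are used by default.

```python
import math


def relativistic_mass(m0, v, c=1.0):
    """Mass m = m0 / sqrt(1 - beta^2) with beta = v/c."""
    return m0 / math.sqrt(1.0 - (v / c) ** 2)


def kinetic_energy(m0, v, c=1.0):
    """Kinetic energy T = (m - m0) c^2, as in equation 101.2."""
    return (relativistic_mass(m0, v, c) - m0) * c * c


if __name__ == "__main__":
    slow = 1e-3  # a speed small compared with c = 1
    print(kinetic_energy(1.0, slow), 0.5 * slow * slow)
    m = relativistic_mass(2.0, 0.6)
    print(m, 2.0 + kinetic_energy(2.0, 0.6))
```

For speeds small compared with $c$ the first two numbers nearly coincide, which is the classical limit $T \approx \tfrac{1}{2}m_0 v^2$; the second line illustrates the total energy as the sum of the intrinsic and kinetic parts.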


102. Restricted Theory. Retrospect and Prospect 


In our development of mechanics in the manifold of the special theory
of relativity we maintained the distinction between the space coordinates
$x^i$, $(i = 1, 2, 3)$, of a particle and the time variable $t = x^4$. The metric
of the space was assumed to be Euclidean. The novel features of the
theory lie in the abandonment of the concept of universal time and in
the demand that the mass of the particle change with velocity in a
predetermined way, if the Newtonian law of motion is to be invariant with
respect to a group of Lorentz-Einstein transformations.

The distinction between the space and time variables can be suppressed
by introducing a single-valued reversible transformation of the $S_4$ manifold,

$\bar x^\alpha = \bar x^\alpha(x^1, x^2, x^3, x^4)$, $(\alpha = 1, 2, 3, 4)$,

where the coordinates $\bar x^\alpha$ are quite analogous to the generalized coordinates
of analytical mechanics. We suppose that our space $S_4$ is so metrized
that the quadratic form

(102.1) $d\sigma^2 = a_{\alpha\beta}\,dx^\alpha\,dx^\beta$

reduces to

(102.2) $d\sigma^2 = c^2\,dt^2 - dy^i\,dy^i$,

when the space coordinates $x^i$ are orthogonal cartesian. Since the
coefficients in the form 102.2 are constants, it follows that the Riemann
curvature tensor $R_{\alpha\beta\gamma\delta}$ of the $S_4$ manifold vanishes, and hence the geodesics
in $S_4$, determined by

(102.3) $\dfrac{d^2x^\alpha}{d\sigma^2} + \left\{ {\alpha \atop \beta\gamma} \right\}\dfrac{dx^\beta}{d\sigma}\dfrac{dx^\gamma}{d\sigma} = 0$,

are straight lines.

We note, with reference to equations 100.5, that equations 102.3
characterize the motion of a particle in the absence of acceleration $f^\alpha$.
This suggests the possibility of interpreting the trajectories of particles,
subjected to the action of nonvanishing forces, as geodesics in some
manifold of the variables $x^\alpha$ for which the curvature tensor does not
vanish.⁸ Physically, this corresponds to the introduction of accelerated
reference frames moving in such a way that the forces acting on the
particles vanish. If this is done, the concept of force need not enter
dynamics, and dynamical trajectories can then be viewed as geodesics
determined by the metric properties of space.

In the remaining section of this chapter we discuss the problem of two 
bodies from a general relativistic point of view. This portion of the 
general theory of relativity was developed in the early 1920’s, and its 
mathematical elegance and success in explaining the advance of the peri- 
helion of Mercury gave hope that the time when all mathematical physics 
would be imbedded in the framework of the general theory of relativity 
was not too far away. However, the researches of the following two 
decades make it appear unlikely that general relativity will prove useful 
in the domain of microscopic physics, because of the failure of the theory 
to unify mechanics and electrodynamics. It is likely that the future 
usefulness of the theory will be in whatever stimulus it may provide to 
speculations in cosmology. These remarks do not detract from the
profound effect which Einstein’s paper,⁹ setting forth the foundations of the
general theory of relativity, had on the revision of the concepts of space, 
time, and matter. 


103. Einstein’s Gravitational Equations 


In order to conform to the usual notation in books on general theory
of relativity, we denote the metric coefficients of the four-dimensional
relativity manifold by $g_{ij}(x^1, x^2, x^3, x^4)$, and write the fundamental
quadratic form as

(103.1) $ds^2 = g_{ij}\,dx^i\,dx^j$, $(i, j = 1, 2, 3, 4)$.


In the special instance of the restricted theory the form 103.1 can be 
reduced by a suitable transformation to the canonical form 


(103.2) $ds^2 = c^2(dt)^2 - dy^i\,dy^i$.


8 A similar situation arose in classical mechanics (Sec. 84), where we introduced a
Riemannian manifold, with the arc element $dS$ of the form

$dS = \sqrt{2m(h - V)\,g_{ij}\,dx^i\,dx^j}$,

in which the trajectories are geodesics.

9 A. Einstein, Annalen der Physik, 49 (1916), p. 769.




Our hypothesis is that the coefficients $g_{ij}$, which we will term potential
functions,¹⁰ can be so chosen that the trajectories of particles satisfy
the equations of geodesics,

(103.3) $\dfrac{d^2x^i}{ds^2} + \left\{ {i \atop jk} \right\}\dfrac{dx^j}{ds}\dfrac{dx^k}{ds} = 0$.

The Riemann curvature tensor $R^i_{jkl}$, associated with the manifold of
restricted theory, vanishes, and the rectilinear geodesics of the manifold
correspond to the trajectories of particles in the absence of a gravitational
field. Consequently, if the manifold with the quadratic form 103.1 is to
account for nonrectilinear trajectories, the Riemann curvature tensor must
not vanish. We assume, with Einstein, that the field of a large gravitating
mass (the sun) is such that the potential functions $g_{ij}$ satisfy in vacuum
the equations

$G^i_j \equiv R^i_j - \tfrac{1}{2}\delta^i_j R = 0$,

where $G^i_j$ is the Einstein tensor defined in Sec. 38. If we contract $G^i_j$,
we get the equation $R - \tfrac{1}{2}\cdot 4R = 0$, so that $R = 0$. Accordingly,

(103.4) $R_{ij} = R^\alpha_{ij\alpha} = 0$,

where $R_{ij}$ is the Ricci tensor. These equations include the flat manifold
of restricted theory and admit the case for which the components of the
curvature tensor do not vanish.

Equations 103.4 are analogous to Laplace’s equation, $g^{ij}V_{,ij} = 0$, of
Newtonian potential theory, which is valid at all points outside gravitating
matter.¹¹

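We recall¹² that the Ricci tensor $R_{ij}$ appearing in the left-hand member
of equation 103.4 is given by

$R_{ij} = \dfrac{\partial^2 \log\sqrt{|g|}}{\partial x^i\,\partial x^j} - \dfrac{\partial}{\partial x^\alpha}\left\{ {\alpha \atop ij} \right\} + \left\{ {\beta \atop i\alpha} \right\}\left\{ {\alpha \atop \beta j} \right\} - \left\{ {\beta \atop ij} \right\}\dfrac{\partial \log\sqrt{|g|}}{\partial x^\beta}$,

where we write $|g|$ since the determinant of the form 103.1 may be negative.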


10 This terminology can be justified by examining the form of the coefficients in
equation 84.9 in a related problem in Newtonian mechanics.

11 This equation is suggested by a chain of reasoning making use of equations of
motion in the form $G^{ij}_{,j} = 0$, where $G^{ij} = -\rho u^i u^j$ with $u^i = dx^i/dt$. A delightful
account of this approach is contained in G. Y. Rainich’s Mathematics of Relativity
(1950). See also Problem 2, Sec. 38.

12 These equations are not independent, and it can be shown that there are four 
relations connecting them. See, for example, A. S. Eddington, The Mathematical Theory 
of Relativity, 2d ed. (1924), p. 115. This fact, however, has no bearing on the calculations 
given below. 




It is obvious from the foregoing that the system of ten nonlinear partial
differential equations (see Sec. 38)

$R_{ij} = 0$

for the ten unknown functions $g_{ij}$ is extremely complicated. The general
solution of this system is not known, and one is obliged to seek particular
solutions, essentially by trial, and use Newtonian mechanics as a guide
in selecting sensible forms for the coefficients $g_{ij}$. Once a set of $g_{ij}$'s
satisfying equations 103.4 is found, we can form the equations of geodesics
103.3, and if the solution of equations 103.3 agrees to the first order of
small quantities with the corresponding situations in Newtonian theory,
all is well.

We shall illustrate this procedure in Sec. 104, where we will obtain the
Schwarzschild¹⁴ solution of the gravitational equations 103.4.

Before we proceed to that topic we note that equations 103.3 can be
written in a neat form,

(103.5) $\dfrac{\delta\lambda^i}{\delta s} = 0$,

where $\lambda^i = dx^i/ds$. If we regard the vector $dx^i/ds = \lambda^i$ as the tangent
vector, then equations 103.5, or $\lambda^i_{,j}\lambda^j = 0$, are precisely the equations for
the parallel displacement of the tangent vector $\lambda^i$ along a geodesic. Our
problem has thus been reduced to the solution of a deceptively
simple-looking system

$R_{ij} = 0$, $\quad \lambda^i_{,j}\lambda^j = 0$,

with which we will occupy ourselves in Secs. 104 and 105.


104. Spherically Symmetric Static Field 


We proceed to deduce a solution of Einstein’s equations

(104.1) $R_{ij} = 0$,

13 It is interesting to note that as an argument for adopting this system of equations
as the law of gravitation it is frequently stated that the law 103.4 represents a simple
relation involving the curvature tensor $R^i_{jkl}$, and hence a desirable one. A skeptic
might feel that the Creator was not greatly concerned with the simplicity of mathematical
physics.

14 K. Schwarzschild, Berlin Sitzungsberichte (1916), p. 189. See also some important
special solutions in G. D. Birkhoff’s Relativity and Modern Physics, pp. 219-227. There
is also the solution of H. Weyl and T. Levi-Civita, corresponding to rotational symmetry.
See P. G. Bergmann, Introduction to the Theory of Relativity (1942), pp. 206-210. For
a comprehensive discussion of spherically symmetric fields see a treatise by J. L. Synge,
Relativity: The General Theory (1960), Chapter VII.




for the gravitational field produced by a spherically symmetric mass 
particle, which will be shown to correspond to the gravitational field of 
the sun fixed at tne origin of our reference frame. In obtaining this 
solution we will be guided by the properties of the Newtonian gravitational 
field and by the form of the corresponding solution in classical mechanics. 

The discussion of the two-body problem in Sec. 97 suggests that we 
adopt as our reference frame a system of coordinates which at great 
distance from the gravitating mass specializes to the ordinary spherical 
coordinate system. Moreover, since the field is spherically symmetric, 
and since the metric of the manifold is determined by the field, the metric 
tensor $g_{ij}$ must be spherically symmetric. Thus we shall select the
coordinates in such a way that, at great distance from the center of attraction
(the origin),


$x^1 = r, \quad x^2 = \theta, \quad x^3 = \phi, \quad x^4 = t,$


where $r$, $\theta$, $\phi$ are the usual spherical coordinates.

The trajectories of particles far away from gravitating matter should
be straight lines, so that $R^i_{jkl} = 0$. We write the limiting form for the
space-time interval as


(104.2)    $ds^2 = (dt)^2 - (dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta \,(d\phi)^2,$


where we have adopted a new unit for the velocity of light c so that 
it is 1. This leads us to assume that, in the presence of a spherically 
symmetric static gravitational field, 


(104.3)    $ds^2 = f_1 (dt)^2 - f_2 (dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta \,(d\phi)^2,$


where $f_1$ and $f_2$ are unknown functions of $r$, each reducing to unity when
$r$ is increased indefinitely.

The cross-product terms $dr\,d\theta$, $d\phi\,d\theta$, etc., are omitted in the form
104.3 since $ds^2$ must be independent of the signs of $d\theta$ and $d\phi$ because of
the spherical symmetry. Likewise, we reject the cross-product terms
involving $dt$, since we assume that the field is static and reversible in time,
and hence must be independent of the sign of $dt$. Our procedure in
determining the functions $f_1$ and $f_2$ will be to insert the expressions for
metric coefficients $g_{ij}$ from (104.3) in the gravitational equations 104.1,
and use equation 104.2 as a boundary condition at infinity.

For the purpose of calculating $f_1$ and $f_2$ it is convenient to set


$f_2 = e^{\lambda}, \qquad f_1 = e^{\mu},$

where $\lambda$ and $\mu$ are functions of $r$. Since effects of the gravitational field
diminish as $r \to \infty$, the functions $\lambda$ and $\mu$ must tend to zero when $r$
increases indefinitely.


302 RELATIVISTIC MECHANICS [CHAP. 5 
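We can write the form 104.3 in the new notation as

(104.4)    $ds^2 = -e^{\lambda}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta \,(d\phi)^2 + e^{\mu}(dt)^2,$

so that the metric coefficients $g_{ij}$ are

$g_{11} = -e^{\lambda}, \quad g_{22} = -r^2, \quad g_{33} = -r^2 \sin^2\theta, \quad g_{44} = e^{\mu},$
$g_{ij} = 0, \quad i \neq j.$

The determinant $g$ of the quadratic form 104.4 is

$g = g_{11} g_{22} g_{33} g_{44} = -e^{\lambda+\mu} r^4 \sin^2\theta,$


and the contravariant tensor $g^{ij}$ is given by the matrix


$(g^{ij}) = \begin{pmatrix} -e^{-\lambda} & 0 & 0 & 0 \\ 0 & -\dfrac{1}{r^2} & 0 & 0 \\ 0 & 0 & -\dfrac{1}{r^2 \sin^2\theta} & 0 \\ 0 & 0 & 0 & e^{-\mu} \end{pmatrix}.$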


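In order to form equations 104.1, we construct the Christoffel symbols
$\begin{Bmatrix} k \\ ij \end{Bmatrix}$ and, since $g_{ij} = 0$ when $i \neq j$, we have


$\begin{Bmatrix} k \\ ij \end{Bmatrix} = \dfrac{1}{2 g_{kk}} \left( \dfrac{\partial g_{ik}}{\partial x^j} + \dfrac{\partial g_{jk}}{\partial x^i} - \dfrac{\partial g_{ij}}{\partial x^k} \right)$    (no sum on $k$).


It is easy to verify that distinct, nonvanishing Christoffel's symbols are


$\begin{Bmatrix} 1 \\ 11 \end{Bmatrix} = \tfrac{1}{2}\lambda', \quad \begin{Bmatrix} 2 \\ 12 \end{Bmatrix} = \dfrac{1}{r}, \quad \begin{Bmatrix} 3 \\ 13 \end{Bmatrix} = \dfrac{1}{r},$

$\begin{Bmatrix} 4 \\ 14 \end{Bmatrix} = \tfrac{1}{2}\mu', \quad \begin{Bmatrix} 1 \\ 22 \end{Bmatrix} = -r e^{-\lambda}, \quad \begin{Bmatrix} 3 \\ 23 \end{Bmatrix} = \cot\theta,$

$\begin{Bmatrix} 1 \\ 33 \end{Bmatrix} = -r \sin^2\theta \, e^{-\lambda}, \quad \begin{Bmatrix} 2 \\ 33 \end{Bmatrix} = -\sin\theta \cos\theta, \quad \begin{Bmatrix} 1 \\ 44 \end{Bmatrix} = \tfrac{1}{2} e^{\mu-\lambda}\mu',$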

where primes denote the derivatives with respect to r. 
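We can now insert these symbols in the formula


$R_{ij} = \dfrac{\partial^2 \log\sqrt{-g}}{\partial x^i \partial x^j} - \dfrac{\partial}{\partial x^{\alpha}} \begin{Bmatrix} \alpha \\ ij \end{Bmatrix} + \begin{Bmatrix} \beta \\ i\alpha \end{Bmatrix} \begin{Bmatrix} \alpha \\ j\beta \end{Bmatrix} - \begin{Bmatrix} \alpha \\ ij \end{Bmatrix} \dfrac{\partial \log\sqrt{-g}}{\partial x^{\alpha}}$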




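and obtain after tedious but simple calculations the following set of
differential equations:


(104.5)    $R_{11} = \tfrac{1}{2}\mu'' - \tfrac{1}{4}\lambda'\mu' + \tfrac{1}{4}(\mu')^2 - \dfrac{\lambda'}{r} = 0,$

(104.6)    $R_{22} = e^{-\lambda}\left[1 + \tfrac{1}{2} r(\mu' - \lambda')\right] - 1 = 0,$

(104.7)    $R_{33} = \sin^2\theta \left\{ e^{-\lambda}\left[1 + \tfrac{1}{2} r(\mu' - \lambda')\right] - 1 \right\} = 0,$

(104.8)    $R_{44} = -e^{\mu-\lambda}\left[\tfrac{1}{2}\mu'' - \tfrac{1}{4}\lambda'\mu' + \tfrac{1}{4}(\mu')^2 + \dfrac{\mu'}{r}\right] = 0,$

$R_{ij} \equiv 0$ for $i \neq j$.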
Equation 104.7 in this set is a mere repetition of equation 104.6. We
thus have only three equations on $\lambda$ and $\mu$ to consider.
From equations 104.5 and 104.8 we deduce that
$\lambda' = -\mu',$
so that
$\lambda = -\mu + \text{constant}.$
However, as $r \to \infty$, $\lambda$ and $\mu$ tend to zero; hence,
$\lambda(r) = -\mu(r).$
Equation 104.6 thus becomes
(104.9)    $e^{\mu}(1 + r\mu') = 1.$
We set
$e^{\mu} = \gamma,$
and equation 104.9 becomes
$\gamma + r\gamma' = 1.$
Integrating this first-order linear equation, we get


(104.10)    $\gamma = 1 - \dfrac{2m}{r},$


where 2m is a constant of integration. We shall identify m, in Sec. 105, 
with the mass of the sun. 

It is easily checked that the solution just obtained satisfies all equations 
in our system. Inserting $e^{-\lambda} = e^{\mu} = \gamma$ in equation 104.4, we get the
desired quadratic form


(104.11)    $ds^2 = -\gamma^{-1}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta \,(d\phi)^2 + \gamma (dt)^2,$


where $\gamma = 1 - 2m/r$. If the constant of integration $2m$ vanishes, $\gamma = 1$,
and the resulting manifold is the flat manifold of restricted theory. For
$m \neq 0$, the manifold is curved.
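
The statement that the solution satisfies all equations of the system can be spot-checked numerically. The following Python sketch is an illustration only, not part of the derivation: it takes $\lambda = -\mu = -\log\gamma$ with $\gamma = 1 - 2m/r$, approximates the derivatives by central differences, and evaluates the left-hand members of 104.5, 104.6, and 104.8. The value `M = 1`, the sample radii, and the step size are arbitrary assumptions made for the check.

```python
import math

M = 1.0  # hypothetical value of the integration constant m (units with c = 1)


def mu(r):
    """mu(r) = log(gamma), with gamma = 1 - 2m/r as in (104.10)."""
    return math.log(1.0 - 2.0 * M / r)


def lam(r):
    """lambda(r) = -mu(r), as deduced from (104.5) and (104.8)."""
    return -mu(r)


def d1(f, r, step=1e-3):
    """Central-difference approximation to the first derivative."""
    return (f(r + step) - f(r - step)) / (2.0 * step)


def d2(f, r, step=1e-3):
    """Central-difference approximation to the second derivative."""
    return (f(r + step) - 2.0 * f(r) + f(r - step)) / (step * step)


def ricci_residuals(r):
    """Left-hand members of (104.5), (104.6), (104.8) at radius r."""
    mu1, mu2, lam1 = d1(mu, r), d2(mu, r), d1(lam, r)
    common = 0.5 * mu2 - 0.25 * lam1 * mu1 + 0.25 * mu1 ** 2
    r11 = common - lam1 / r
    r22 = math.exp(-lam(r)) * (1.0 + 0.5 * r * (mu1 - lam1)) - 1.0
    r44 = -math.exp(mu(r) - lam(r)) * (common + mu1 / r)
    return r11, r22, r44


residuals = [ricci_residuals(r) for r in (5.0, 10.0, 50.0)]
max_residual = max(abs(x) for row in residuals for x in row)
print(max_residual)  # should be small, limited by the finite-difference error
```

Equation 104.7 is omitted from the check because it differs from 104.6 only by the factor $\sin^2\theta$.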




The reader may feel uneasy about the Schwarzschild solution of 
Einstein’s gravitational equations, since it was obtained on the basis of 
several fortuitous guesses with one eye cocked on results of the classical 
theory. He may feel that a different mode of attack might yield a different 
solution. That this is not so was shown¹⁵ by G. D. Birkhoff, who demonstrated
that all spherically symmetric static solutions of the gravitational
equations $R_{ij} = 0$, which yield a flat metric at infinity (that is, the one
characterized by equation 104.2), are equivalent to the Schwarzschild 
solution. Thus the solution obtained previously is of interest because 
it is the only static solution of our equations satisfying specified boundary 
conditions at infinity. 


105. Planetary Orbits 


We are in a position now to determine the trajectory of a particle 
moving in a spherically symmetric static field determined by the quadratic 
form 104.11. The trajectory of the particle is a geodesic, so that we have 
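to solve the set of equations¹⁶


$\dfrac{d^2 x^i}{ds^2} + \begin{Bmatrix} i \\ \alpha\beta \end{Bmatrix} \dfrac{dx^{\alpha}}{ds} \dfrac{dx^{\beta}}{ds} = 0,$


where $x^1 = r$, $x^2 = \theta$, $x^3 = \phi$, $x^4 = t$.
Making use of the table of values of Christoffel's symbols given in
Sec. 104, we find that for $i = 2$, for example, we have the equation


$\dfrac{d^2 x^2}{ds^2} + \begin{Bmatrix} 2 \\ 12 \end{Bmatrix} \dfrac{dx^1}{ds}\dfrac{dx^2}{ds} + \begin{Bmatrix} 2 \\ 21 \end{Bmatrix} \dfrac{dx^2}{ds}\dfrac{dx^1}{ds} + \begin{Bmatrix} 2 \\ 33 \end{Bmatrix} \left(\dfrac{dx^3}{ds}\right)^2 = 0,$

or

(105.1)    $\dfrac{d^2\theta}{ds^2} + \dfrac{2}{r}\dfrac{dr}{ds}\dfrac{d\theta}{ds} - \sin\theta\cos\theta\left(\dfrac{d\phi}{ds}\right)^2 = 0.$


In a similar way we form equations for $i = 1, 3, 4$. The results are


(105.2)    $\dfrac{d^2 r}{ds^2} - \dfrac{1}{2\gamma}\dfrac{d\gamma}{dr}\left(\dfrac{dr}{ds}\right)^2 - \gamma r\left(\dfrac{d\theta}{ds}\right)^2 - \gamma r \sin^2\theta\left(\dfrac{d\phi}{ds}\right)^2 + \dfrac{\gamma}{2}\dfrac{d\gamma}{dr}\left(\dfrac{dt}{ds}\right)^2 = 0,$

(105.3)    $\dfrac{d^2\phi}{ds^2} + \dfrac{2}{r}\dfrac{dr}{ds}\dfrac{d\phi}{ds} + 2\cot\theta\,\dfrac{d\theta}{ds}\dfrac{d\phi}{ds} = 0,$

(105.4)    $\dfrac{d^2 t}{ds^2} + \dfrac{1}{\gamma}\dfrac{d\gamma}{dr}\dfrac{dt}{ds}\dfrac{dr}{ds} = 0.$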


15 G. D. Birkhoff, Relativity and Modern Physics, p. 253.
16 For an elegant treatment of planetary orbits by means of Lagrangean equations
see J. L. Synge, Relativity: The General Theory (1960), pp. 289-298. 




The last of these equations can be written


$\dfrac{d^2 t}{ds^2} + \dfrac{1}{\gamma}\dfrac{d\gamma}{ds}\dfrac{dt}{ds} = 0,$

or

(105.5)    $\dfrac{d}{ds}\left(\gamma\,\dfrac{dt}{ds}\right) = 0.$


We will prove that the analytic solution of equation 105.1, satisfying
the initial condition $d\theta/ds = 0$, when $\theta = \pi/2$, is $\theta(s) = \pi/2$. Since
$d\theta/ds = (d\theta/dt)(dt/ds)$, and $dt/ds \neq 0$, this is equivalent to showing that
the trajectory of the particle lies in the plane $\theta = \pi/2$, provided that the
initial component $d\theta/dt$ of the velocity, in the direction of increasing $\theta$,
vanishes. We thus assume that the solution $\theta(s)$ can be represented by
the series

(105.6)    $\theta(s) = \theta(0) + \left(\dfrac{d\theta}{ds}\right)_0 s + \dfrac{1}{2!}\left(\dfrac{d^2\theta}{ds^2}\right)_0 s^2 + \cdots.$

Since $d\theta/ds = 0$, when $\theta = \pi/2$, equation 105.1 for $\theta = \pi/2$ gives
$(d^2\theta/ds^2)_0 = 0$.

To obtain $(d^3\theta/ds^3)_0$ we differentiate equation 105.1, and insert in the
result the values $\theta = \pi/2$, $d\theta/ds = 0$, and $d^2\theta/ds^2 = 0$. We find $d^3\theta/ds^3 = 0$.
In this manner we can show that $\theta(s)$ in (105.6) is $\theta(s) = \theta(0) = \pi/2$.

The corresponding result in the Newtonian case is obvious since, under 
the assumption of the central field of force, there can be no component 
of force at right angles to the plane of motion. Thus, if the motion had 
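once started in the plane $\theta = \pi/2$, it would continue in that plane. If
we insert the solution $\theta = \pi/2$ of equation 105.1 in equation 105.3, we get


(105.7)    $\dfrac{d^2\phi}{ds^2} + \dfrac{2}{r}\dfrac{dr}{ds}\dfrac{d\phi}{ds} = 0,$

and integrating equations 105.5 and 105.7 we obtain

(105.8)    $r^2\,\dfrac{d\phi}{ds} = h,$

(105.9)    $\dfrac{dt}{ds} = \dfrac{a}{\gamma},$


where $a$ and $h$ are arbitrary constants.
Substituting in equation 105.2 from 105.8 and 105.9, and using the
previously found solution $\theta = \pi/2$, we have


(105.10)    $\dfrac{d^2 r}{ds^2} - \dfrac{1}{2\gamma}\dfrac{d\gamma}{dr}\left(\dfrac{dr}{ds}\right)^2 - \gamma r\left(\dfrac{h}{r^2}\right)^2 + \dfrac{\gamma}{2}\dfrac{d\gamma}{dr}\left(\dfrac{a}{\gamma}\right)^2 = 0.$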



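The expression for $(dr/ds)^2$, appearing in equation 105.10, can be obtained
from formula 104.11 by using equations 105.8 and 105.9 and $\theta = \pi/2$.
We have

$\left(\dfrac{dr}{ds}\right)^2 = a^2 - \gamma\left(1 + \dfrac{h^2}{r^2}\right),$

which, upon insertion in (105.10), gives

(105.11)    $\dfrac{d^2 r}{ds^2} + \dfrac{m}{r^2} - \dfrac{h^2}{r^3}\left(1 - \dfrac{3m}{r}\right) = 0,$


since $\gamma = 1 - 2m/r$. But


$\dfrac{dr}{ds} = \dfrac{dr}{d\phi}\dfrac{d\phi}{ds}, \qquad \dfrac{d^2 r}{ds^2} = \dfrac{d^2 r}{d\phi^2}\left(\dfrac{d\phi}{ds}\right)^2 + \dfrac{dr}{d\phi}\dfrac{d^2\phi}{ds^2} = \dfrac{h^2}{r^4}\dfrac{d^2 r}{d\phi^2} - \dfrac{2h^2}{r^5}\left(\dfrac{dr}{d\phi}\right)^2,$


where we made use of equation 105.8.
Thus equation 105.11 can be written in the form


(105.12)    $\dfrac{h^2}{r^4}\dfrac{d^2 r}{d\phi^2} - \dfrac{2h^2}{r^5}\left(\dfrac{dr}{d\phi}\right)^2 + \dfrac{m}{r^2} - \dfrac{h^2}{r^3}\left(1 - \dfrac{3m}{r}\right) = 0.$

If we introduce a new dependent variable $u = 1/r$,

$\dfrac{dr}{d\phi} = -\dfrac{1}{u^2}\dfrac{du}{d\phi}, \qquad \dfrac{d^2 r}{d\phi^2} = \dfrac{2}{u^3}\left(\dfrac{du}{d\phi}\right)^2 - \dfrac{1}{u^2}\dfrac{d^2 u}{d\phi^2},$

and equation 105.12 reduces to

(105.13)    $\dfrac{d^2 u}{d\phi^2} + u = \dfrac{m}{h^2} + 3mu^2.$

Equation 105.13 together with equation 105.8, which we write as

(105.14)    $\dfrac{d\phi}{ds} = hu^2,$


suffices to determine the trajectory.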
It is interesting to write down here the corresponding equations of 
the classical theory obtained in Sec. 97: 


(105.15)    $\dfrac{d^2 u}{d\phi^2} + u = \dfrac{k m_s}{h^2}, \qquad \dfrac{d\phi}{dt} = hu^2,$

where we write $\phi$ for the angular variable $\theta$ used in that section and
introduce the gravitational constant $k = 6.7 \times 10^{-8}$, and
$m_s = 1.98 \times 10^{33}$ gr is the mass of the sun. Because of our choice of units
for the velocity of light, we note that far away from gravitating matter


$ds^2 = (dt)^2 - dy^i\,dy^i,$

so that

$\left(\dfrac{ds}{dt}\right)^2 = 1 - \dfrac{dy^i}{dt}\dfrac{dy^i}{dt} = 1 - v^2.$


For planetary velocities, $v$ is very small compared with the velocity
of light, which we took to be 1, so that to a high degree of approximation
$ds = dt$. Thus, in both classical and relativistic sets of equations, $h$ can
be interpreted as the sectorial velocity. The constant of integration $m$
corresponds to $km_s$, so that the relativistic equation 105.13 differs from
the corresponding classical equation only in the appearance of the term
$3mu^2$.

Now, the ratio of $3mu^2$ to $m/h^2$ is $3h^2u^2$, or using equation 105.14 it
is $3(r\,d\phi/ds)^2$. For ordinary planetary speeds this ratio is small. For
example, the average radius of the earth's orbit is $r = 1.5 \times 10^{13}$ cm,
the angular velocity $d\phi/dt = 2 \cdot 10^{-7}$ rad/sec, and, if we take as a first
approximation $dt/ds = 1/c$, we find the value of $3r^2(d\phi/ds)^2$ to be of the
order $10^{-8}$.
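
The order-of-magnitude estimate just made is easy to reproduce. The short Python sketch below uses the values of $r$ and $d\phi/dt$ quoted in the text, together with the value $c = 3 \cdot 10^{10}$ cm/sec quoted later in this section; the variable names are ours and serve only for illustration.

```python
# Ratio of the relativistic correction 3*m*u**2 to the Newtonian term m/h**2,
# written as 3*(r * dphi/ds)**2 with the first approximation dt/ds = 1/c.
r = 1.5e13         # average radius of the earth's orbit, cm (from the text)
dphi_dt = 2.0e-7   # angular velocity of the earth, rad/sec (from the text)
c = 3.0e10         # velocity of light, cm/sec (value quoted later in Sec. 105)

orbital_speed = r * dphi_dt              # cm/sec
ratio = 3.0 * (orbital_speed / c) ** 2   # dimensionless
print(ratio)
```

The ratio is three times the square of the orbital speed measured in units of the velocity of light, which makes plain why the correction term is negligible as far as the shape of the orbit is concerned.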

Consequently, in ordinary planetary motion “‘the correction term” in 
the relativistic equation 105.13 is negligible, as far as the shape of the 
orbit is concerned, but the influence of this term on the behavior of the 
perihelion, as will be seen in Sec. 106, is significant. 

It will be shown in the next section that the perihelion rotates through
an angle $6\pi m^2/h^2$ rad during each revolution. This value proves to be too
small to be measured reliably for all planets in the solar system with the
exception of Mercury, for which it corresponds to nearly 42″ of arc per century. This advance of
the perihelion of Mercury has found no satisfactory explanation on the 
basis of the Newtonian theory, and we will see that the calculations based 
on the relativistic equation 105.13 give results which agree extraordinarily 
well with observed values. 

We conclude this section by remarking that, if the foregoing calculations
were performed with the quadratic form


$ds^2 = c^2\gamma\,(dt)^2 - \dfrac{(dr)^2}{\gamma} - r^2\left[(d\theta)^2 + \sin^2\theta\,(d\phi)^2\right]$


as a basis, we would have arrived at the equation¹⁷


$\dfrac{d^2 u}{d\phi^2} + u = \dfrac{k m_s}{h^2} + \dfrac{3 k m_s u^2}{c^2},$

where $m_s = 1.98 \times 10^{33}$ gr (mass of the sun), $k = 6.7 \times 10^{-8}$ gr$^{-1}$ cm$^3$/sec$^2$,
$c = 3 \cdot 10^{10}$ cm/sec.

For the motion of Mercury the term km,/h? is of the order 107”, 
whereas 3km,u2/c? is of the order 10-*1. These estimates justify us in 
attempting to solve equation 105.13 by a method of successive approxi- 
mations sketched in the following section. 


106. The Advance of Perihelion 


A comparison of analytical results of this section with observed astro- 
nomical data provides us with the best available evidence in support of 
the general theory of relativity. In Sec. 107 we mention the deflection 
of the light beam by the sun and the shift of the Fraunhofer lines toward 
the red end of the spectrum, but the quantitative agreement for these 
phenomena between observations and theoretical predictions is still in 
some doubt. 

The relativistic equation for the orbit of a planet


(106.1)    $\dfrac{d^2 u}{d\phi^2} + u = \dfrac{m}{h^2}\left(1 + 3h^2u^2\right),$


deduced in Sec. 105, can be integrated in closed form with the aid of
elliptic functions, but the solution obtained in this way does not lend
itself to a convenient comparison with the corresponding result obtained
in Sec. 97 on the basis of the Newtonian theory.

We noted in Sec. 105 that the magnitude of the term $3h^2u^2$, appearing
in the right-hand member of equation 106.1, is small compared with
unity, and this justifies us in attempting to obtain a solution of this equation
by the method of perturbations. Accordingly, we neglect the small term
$3mu^2$ and obtain for our first approximation $u_1$ the Newtonian equation


$\dfrac{d^2 u_1}{d\phi^2} + u_1 = \dfrac{m}{h^2},$

the solution of which is

(106.2)    $u_1 = \dfrac{m}{h^2}\left[1 + e\cos(\phi - \omega)\right],$


17 In this equation the sectorial velocity $h$ is the sectorial velocity of the classical theory.




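where $e$ is the eccentricity of the orbit and $\omega$ is the longitude of the
perihelion. Inserting from equation 106.2 in the right-hand member of
equation 106.1 yields


(106.3)    $\dfrac{d^2 u}{d\phi^2} + u = \dfrac{m}{h^2}\left(1 + 3h^2u_1^2\right)$

$\qquad = \dfrac{m}{h^2} + \dfrac{6m^3}{h^4}\,e\cos(\phi - \omega) + \dfrac{3m^3}{h^4} + \dfrac{3m^3e^2}{2h^4}\left[1 + \cos 2(\phi - \omega)\right].$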


Since planetary orbits are nearly circular (for Mercury, $e^2 = 0.04$),
the contribution of the perturbation term containing $e^2$ will be negligible.
Also the term $3m^3/h^4$ will not have a significant effect on the shape of the
orbit, but the second term, containing $\cos(\phi - \omega)$, may have a pronounced
cumulative effect on the displacement of the perihelion. Accordingly,
we simplify equation 106.3 to read


$\dfrac{d^2 u}{d\phi^2} + u = \dfrac{m}{h^2} + \dfrac{6m^3}{h^4}\,e\cos(\phi - \omega).$

The solution of this linear equation is clearly made up of the solution
$u_1$ and the solution of


$\dfrac{d^2 u}{d\phi^2} + u = \dfrac{6m^3}{h^4}\,e\cos(\phi - \omega).$

The result of easy calculations gives us the second approximation $u_2$ in
the form


(106.4)    $u_2 = \dfrac{m}{h^2}\left[1 + e\cos(\phi - \omega) + \dfrac{3m^2}{h^2}\,e\phi\sin(\phi - \omega)\right].$


It will suffice for our purposes to terminate the sequence of steps in
the scheme of successive approximations at this stage and to regard $u_2$
as representing the solution of equation 106.1 to a sufficiently high degree
of accuracy. If we set


(106.5)    $\delta\omega = \dfrac{3m^2}{h^2}\,\phi$

and note that

$\cos(\phi - \omega) + \delta\omega\sin(\phi - \omega) = \sqrt{1 + (\delta\omega)^2}\,\cos(\phi - \omega - \alpha),$


where $\alpha = \tan^{-1}\delta\omega \doteq \delta\omega$, we can write (106.4) as


(106.6)    $u_2 = \dfrac{m}{h^2}\left[1 + e\cos(\phi - \omega - \delta\omega)\right]$




if we neglect in comparison with unity terms of the order $(\delta\omega)^2$. It is
clear from equations 106.5 and 106.6 that when a planet moves through
one revolution, the perihelion advances through an angle


(106.7)    $\varepsilon = \dfrac{3m^2}{h^2}\,2\pi$ rad.
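
The first-order result 106.7 can be compared with a direct numerical integration of equation 106.1. The Python sketch below is an illustration under stated assumptions and is not the method of the text: it rewrites 106.1 as a first-order system in $u$ and $w = du/d\phi$, integrates it with the classical fourth-order Runge-Kutta scheme starting from a perihelion, and locates the next perihelion as the next maximum of $u$. The values `m = 0.02`, `h = 1`, `e = 0.2` are hypothetical and deliberately exaggerated so that the advance is visible within a single revolution; the function name and step size are likewise ours.

```python
import math


def perihelion_advance(m, h, e, step=1e-3):
    """Advance of perihelion per revolution from u'' + u = m/h^2 + 3 m u^2."""

    def rhs(u, w):
        # First-order system: du/dphi = w, dw/dphi = m/h^2 + 3 m u^2 - u.
        return w, m / h ** 2 + 3.0 * m * u * u - u

    # Start at a perihelion: u is at a maximum, so du/dphi = 0.
    u, w, phi = (m / h ** 2) * (1.0 + e), 0.0, 0.0
    while phi < 4.0 * math.pi:
        k1u, k1w = rhs(u, w)
        k2u, k2w = rhs(u + 0.5 * step * k1u, w + 0.5 * step * k1w)
        k3u, k3w = rhs(u + 0.5 * step * k2u, w + 0.5 * step * k2w)
        k4u, k4w = rhs(u + step * k3u, w + step * k3w)
        u_new = u + step * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w_new = w + step * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        if w > 0.0 and w_new <= 0.0:
            # du/dphi changes sign from + to -: next maximum of u (perihelion).
            phi_peri = phi + step * w / (w - w_new)  # linear interpolation
            return phi_peri - 2.0 * math.pi
        u, w, phi = u_new, w_new, phi + step
    raise RuntimeError("no perihelion found within two revolutions")


m, h, e = 0.02, 1.0, 0.2           # hypothetical, exaggerated values
advance = perihelion_advance(m, h, e)
predicted = 6.0 * math.pi * m ** 2 / h ** 2   # formula (106.7)
print(advance, predicted)
```

For such exaggerated parameters the two numbers should agree only to within the neglected higher-order terms; for actual planetary values of $m^2/h^2$ the difference would be far smaller.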


Equation 106.6 represents a closed orbit, only approximately elliptical
in shape, because $\delta\omega$ is a function of $\phi$. Since $u = 1/r$, we have


$r = \dfrac{h^2/m}{1 + e\cos(\phi - \omega - \delta\omega)},$


so that the "semilatus rectum" $l = h^2/m$.
Recalling from the geometry of conics that $l = a(1 - e^2)$, where $a$ is
the semimajor axis of the conic, we get


$h^2 = ml = ma(1 - e^2).$

Inserting this result in equation 106.7 we have¹⁸


$\varepsilon = \dfrac{6\pi m^2}{am(1 - e^2)} = \dfrac{6\pi m}{a(1 - e^2)}.$


In this expression $m$ is the mass of the sun.

For Mercury the quantity $\varepsilon$ works out to be $4.90 \times 10^{-7}$ rad. This
angle is very small, but the observational data on the location of Mercury
during the last century are available, and since this planet has a period
of 88 days, it completes 415 revolutions per century. Thus the cumulative
advance of the perihelion in 100 years should amount to $415\varepsilon = 2.04 \times
10^{-4}$ rad = 42″ of arc. For planets other than Mercury the corresponding
advance is too small for accurate experimental determination. Thus for
Venus it is only 9″, for Earth 4″, and for Mars 1″.
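
The arithmetic for Mercury can be retraced with a few lines of Python. In the sketch below the constants $k$, $m_s$, and $c$ are those quoted in Sec. 105, and $m = km_s/c^2$ is the corresponding constant in ordinary units. The semimajor axis `a = 5.79e12` cm and the eccentricity `e = 0.2056` are not given in the text; they are standard astronomical values assumed here for illustration, so the result can be expected to agree with the figures quoted above only to within a few per cent.

```python
import math

k = 6.7e-8       # gravitational constant, cgs units (from Sec. 105)
m_s = 1.98e33    # mass of the sun, gr (from Sec. 105)
c = 3.0e10       # velocity of light, cm/sec (from Sec. 105)

a = 5.79e12      # semimajor axis of Mercury's orbit, cm (assumed, not in the text)
e = 0.2056       # eccentricity of Mercury's orbit (assumed; the text gives e^2 = 0.04)

m = k * m_s / c ** 2                               # constant m in units of length, cm
epsilon = 6.0 * math.pi * m / (a * (1.0 - e ** 2)) # advance per revolution, rad

revolutions_per_century = 415                      # from the text
arcsec_per_century = epsilon * revolutions_per_century * (180.0 / math.pi) * 3600.0
print(epsilon, arcsec_per_century)
```

With these assumed orbital elements the advance per revolution comes out close to $5 \times 10^{-7}$ rad, and the cumulative advance per century close to the 42″ cited in the text.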

The actual path of Mercury about the sun is not an ellipse, of course, 
because of the perturbing effects of other planets. We are not in reality 
dealing with a two-body problem. However, perturbations due to other 
planets can be taken into account and the deviations from an elliptical
path calculated. Such calculations have been performed with great care, 
and it has been found that the advance of Mercury’s perihelion should 
amount to about 42" of arc per century. The Newtonian theory is unable 
to account for the advance of this amount, and the remarkably close 


18 For a different way of deducing the value of $\varepsilon$ see J. L. Synge, Relativity: The
General Theory (1960), pp. 294-296, and G. Y. Rainich, Mathematics of Relativity 
(1950), p. 162. 




agreement between the relativistic calculations and the best observed 
value can hardly be viewed as fortuitous.¹⁹

It is worth noting that the calculations based on the restricted theory 
of relativity also give a precessional effect when one assumes that a 
particle moves in a field of force with potential V = km/r. However, 
the precession based on such calculations yields results that are not as 
close to the observed value as those furnished by the general theory. 


107. Concluding Remarks 


We conclude this chapter with a mention of the relativistic prediction 
of deflection of light rays by the sun and of the shift toward the red 
end of the spectrum of spectral lines of light originating in dense stars.²⁰

Since light is material in nature it must be affected by the gravitational 
field of the sun, and the deviation from the rectilinear path of the light 
ray from a distant star, as it grazes the sun, can be readily calculated. 

The deflection of light rays passing near a large mass can be observed 
during eclipses of the sun when fixed stars in the apparent neighborhood 
of the sun become visible. However, because of the uncertainty about 
the magnitude of experimental errors arising from the difficulty of ob- 
taining sharp photographic images, it is generally conceded that these 
results neither prove nor disprove the general theory. It may be remarked 
that the calculations based on Newtonian theory of gravitation can be 
made to account for about one-half of the observed values. 

Among other experimental evidence cited in favor of the general theory 
is the observed displacement of spectral lines of light emitted from the 
stars toward the red end of the spectrum. Elementary considerations 
indicate that the frequency of vibration of the emitted light from a distant 
star is less than the corresponding frequency on the surface of the earth.²¹
If this frequency is associated with the emitted light from the sun, the 
lines of the solar spectrum should be shifted slightly toward the long-wave 
end of the spectrum as contrasted with the corresponding lines of terrestrial 


19 G. M. Clemence gives 42″.56 ± 0.94 in Reviews of Modern Physics, vol. 19 (1947),
p. 361. See also G. C. McVittie, General Relativity and Cosmology (1956). These
authors make incisive comments on the difficulty of performing meaningful astronomical
observations.

20 See J. L. Synge, Relativity: The General Theory (1960), pp. 238-308, and Secs. 36 
and 37 of G. Y. Rainich’s Mathematics of Relativity (1950). See, also, P. G. Bergmann, 
Introduction to the Theory of Relativity (1942), Chapter XIV, and A. S. Eddington, 
Mathematical Theory of Relativity (1924), pp. 90-93. A critical survey of the validity of
predictions of Einstein's theory is provided by G. C. McVittie, General Relativity and
Cosmology (1956).

21 See references given in the preceding footnote.




spectra. The expected shift for the light emitted by the sun is very small,
but for the companion of Sirius it is estimated to be about thirty times
as great as for vibrating solar particles and should be observed with a
reasonable accuracy. In 1925, Adams measured the "red shift" for the
companion of Sirius²² and found it to be $\Delta\lambda = 0.27$ for the line of
wavelength $\lambda = 4000$ Å. From this determination the diameter of the star
can be estimated, and it is found to be of the right order of magnitude.
The evidence here is not conclusive, but it is generally regarded as 
favorable. 

The law of gravitation $R_{ij} = 0$ was generalized by Einstein to the form
$R_{ij} = \lambda g_{ij}$, where $\lambda$ is a small "universal constant." Solutions of the
generalized equation have led to various cosmological theories and have
given rise to speculations about the expanding universe. We refer the
reader for detailed accounts to specialized treatises on this subject.²³


22 The shift of the corresponding line in the sun's spectrum is calculated to be
$\Delta\lambda = 0.008$.
23 A. Eddington, Mathematical Theory of Relativity (1924). 
R. C. Tolman, Relativity, Thermodynamics and Cosmology (1934). 
P. Bergmann, Introduction to the Theory of Relativity (1942). 
G. Y. Rainich, Mathematics of Relativity (1950). 
L. Landau and E. Lifshitz, The Classical Theory of Fields (1951). 
J. L. Synge, Relativity: The Special Theory (1956). 
J. L. Synge, Relativity: The General Theory (1960). 


6 


MECHANICS OF CONTINUOUS MEDIA 


108. Introductory Remarks 


This chapter contains a general formulation of the basic concepts of 
mechanics of continua and a derivation of the fundamental equations 
governing the behavior of continuous media. The treatment contained 
here forms a substantial introduction to nonlinear mechanics of fluids and 
elastic solids. The linearized equations of classical theory appear as 
special cases of nonlinear equations, and throughout the chapter emphasis 
is placed on the unified formulation of equations of mechanics of continua 
in the most general tensor form. 

A systematic development of tensor calculus, with an eye to applications 
to mechanics of continuous media, is contained in P. Appell's definitive
Traité de mécanique rationnelle, vol. 5 (1926), and in A. J. McConnell’s 
Applications of the Absolute Differential Calculus (1931). These are largely 
concerned with the linearized cases. The landmarks in the domain of 
nonlinear theory of elasticity are papers by Leon Brillouin, “Les lois de 
l’élasticité sous forme tensorielle valable pour des coordonnées quelconques,”
Annales de physique, 3 (1925), pp. 251-298, and F. D. Murnaghan,¹
“Finite Deformations of an Elastic Solid,” American Journal of Mathe- 
matics, 59 (1937), pp. 235-260. The essence of Brillouin’s contributions 
appears also in his book Les tenseurs en mécanique et en élasticité, first 
published by Masson et Cie in 1938, and reprinted by the Dover Press in 
1946. 

Among more recent contributions to nonlinear theory of elasticity are 
the books of V. V. Novozhilov, Foundations of Non-Linear Theory of 
Elasticity, Moscow (1947), A. E. Green and W. Zerna, Theoretical 
Elasticity, Oxford (1954), and A. Signorini, Questioni di elasticità non
linearizzata, Rome (1960). An exhaustive critical survey of the foundations
of elasticity and fluid mechanics is contained in two extensive


1 A brief exposition of the central ideas of Murnaghan’s contributions will be found 
in Chapters 14 and 15 of A. D. Michal’s Matrix and Tensor Calculus (1947). 


memoirs by C. Truesdell in Journal of Rational Mechanics and Analysis, 
vol. 1 (1952) and vol. 2 (1953). 

A development of the foundations of continuum mechanics, primarily 
within the framework of linear theories (including applications to me- 
chanics of fluids, elasticity, and plasticity) is contained in W. Prager’s 
Introduction to Mechanics of Continua, Boston (1961). A general unified 
development of geometrically and dynamically nonlinear mechanics of 
continuous media will be found in an excellent monograph by L. I. Sedov, 
Introduction to Mechanics of a Continuous Medium, Moscow (1962). 
Sedov’s monograph to a large extent is based on the close union of classical 
mechanics and macroscopic thermodynamics. This unification permits 
one to construct the general models of gases, liquids, elastic and thermo- 
elastic solids, and of several types of plastic media from a single point 
of view. 


109. Deformation of a Continuous Medium 


We consider a continuum of identifiable material points which at a
given time $t = t_0$ fill a certain region of space $\tau_0$. We shall refer to $t_0$ as
the initial time and shall call $\tau_0$ the initial region. With the passage of time
the points $P$ of $\tau_0$ undergo displacements and at some time $t$ fill a certain
region $\tau$. In the course of displacement, the initial region $\tau_0$ is usually
deformed, and we suppose that the deformation of $\tau_0$ into $\tau$ is fully
determined when the motion of every point $P$ is known. To describe the
motion of points $P$ we introduce a coordinate system $X$ which moves with
the medium in such a way that the coordinates $(x^1, x^2, x^3)$ of any given
point $P$ initially in $\tau_0$ do not change with $t$. In addition to the system $X$
we consider a fixed reference frame $Y$, relative to which the coordinates of
the point $P(x^1, x^2, x^3)$ are given by
(109.1)    $y^i = y^i(x^1, x^2, x^3, t).$

The functional form of relations (109.1) clearly depends on the nature
of deformation of $\tau_0$ into $\tau$. We shall assume that the functions $y^i(x, t)$ in
(109.1) are single-valued, piecewise smooth, and possess for each value of
time $t$ a single-valued, piecewise smooth inverse


(109.2)    $x^i = x^i(y^1, y^2, y^3, t).$


The fixed coordinate system Y, without loss of generality, can be assumed 
to be orthogonal cartesian. 


A material point $P$ in $\tau_0$, relative to an orthogonal cartesian frame $Y$, is
determined by the position vector (Fig. 51)


(109.3)    $\mathbf{r}_0 = \mathbf{c}_i y^i = \mathbf{c}_i y^i(x^1, x^2, x^3, t_0),$




Fig. 51 


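where the $\mathbf{c}_i$ are the orthonormal base vectors associated with the frame $Y$.
The location of the same point $P$ in the region $\tau$ is determined by the
vector


(109.4)    $\mathbf{r} = \mathbf{c}_i y^i(x^1, x^2, x^3, t).$


The base vectors $\mathbf{b}_i$ in the moving frame $X$ are given by


(109.5)    $\mathbf{b}_i = \dfrac{\partial \mathbf{r}}{\partial x^i} = \mathbf{c}_j \dfrac{\partial y^j}{\partial x^i},$


and these vectors obviously depend not only on the coordinates $x^i$ of $P$,
but also on $t$. When $P(x^1, x^2, x^3)$ is in $\tau_0$, we denote the base vectors $\mathbf{b}_i$ by
$\mathbf{a}_i$, so that

(109.6)    $\mathbf{a}_i = \dfrac{\partial \mathbf{r}_0}{\partial x^i} = \mathbf{c}_j \dfrac{\partial y^j(x, t_0)}{\partial x^i}.$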


Thus, in analyzing the deformation of a continuous medium, we can speak
of three reference frames: a fixed reference system $Y$ determined by the
basis $\mathbf{c}_i$, a moving reference frame $X$ with the basis $\mathbf{b}_i$, and a fixed reference
frame $X$ with the basis $\mathbf{a}_i$. We emphasize that the labels $(x^1, x^2, x^3)$ of a
given material point $P$ in both curvilinear coordinate systems $X$ have the
same values, but to avoid circumlocution we shall denote the point
$P(x^1, x^2, x^3)$ when it is located in the initial region $\tau_0$ by $P_0$.
Let $P_0'$ be a point in the neighborhood of $P_0(x^1, x^2, x^3)$. The vector
$\overrightarrow{P_0P_0'} = d\mathbf{r}_0$ can be represented in the form


(109.7)    $d\mathbf{r}_0 = \mathbf{a}_i\,dx^i$

and the square of the arc element $ds_0$ in $\tau_0$ is

$(ds_0)^2 = d\mathbf{r}_0 \cdot d\mathbf{r}_0 = \mathbf{a}_i \cdot \mathbf{a}_j\,dx^i\,dx^j,$

or

(109.8)    $(ds_0)^2 = h_{ij}\,dx^i\,dx^j,$

where $h_{ij} = \mathbf{a}_i \cdot \mathbf{a}_j$ are metric coefficients in $\tau_0$. Similarly, the square of
the element of arc $ds$ determined by the corresponding vector $\overrightarrow{PP'} = d\mathbf{r} =
\mathbf{b}_i\,dx^i$ in $\tau$ is

$ds^2 = \mathbf{b}_i \cdot \mathbf{b}_j\,dx^i\,dx^j,$

or

(109.9)    $ds^2 = g_{ij}\,dx^i\,dx^j,$


where the $g_{ij} = \mathbf{b}_i \cdot \mathbf{b}_j$ are metric coefficients in $\tau$. Ordinarily the lengths
and the orientations of vectors $d\mathbf{r}_0$ and $d\mathbf{r}$ will be different, and we shall
say that the medium occupying $\tau$ is strained whenever $ds_0 \neq ds$. We can
take as our measure of strain the difference


(109.10)    $(ds)^2 - (ds_0)^2 = (g_{ij} - h_{ij})\,dx^i\,dx^j,$

and, if we set

(109.11)    $g_{ij} - h_{ij} = 2\epsilon_{ij},$

we can write (109.10) as

(109.12)    $(ds)^2 - (ds_0)^2 = 2\epsilon_{ij}\,dx^i\,dx^j.$
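
The construction of the functions $\epsilon_{ij}$ from formulas 109.5, 109.6, and 109.11 can be illustrated on a simple example. The Python sketch below is an illustration only: it approximates the base vectors $\mathbf{b}_i$ and $\mathbf{a}_i$ by central differences of a given motion $y^i(x, t)$, forms the metric coefficients $g_{ij}$ and $h_{ij}$, and returns $\epsilon_{ij} = \tfrac{1}{2}(g_{ij} - h_{ij})$. The motion chosen, a simple shear $y^1 = x^1 + Kt\,x^2$, $y^2 = x^2$, $y^3 = x^3$ with the hypothetical constant `K = 0.1`, is our assumption and does not appear in the text.

```python
def base_vectors(y, x, t, dx=1e-6):
    """Base vectors dr/dx^i (i = 1, 2, 3) of the motion y(x, t), by central differences."""
    vectors = []
    for i in range(3):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += dx
        x_minus[i] -= dx
        y_plus, y_minus = y(x_plus, t), y(x_minus, t)
        vectors.append([(y_plus[j] - y_minus[j]) / (2.0 * dx) for j in range(3)])
    return vectors


def metric(vectors):
    """Metric coefficients as scalar products of the base vectors."""
    return [[sum(vectors[i][k] * vectors[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]


def strain(y, x, t0, t):
    """Strain components eps_ij = (g_ij - h_ij) / 2, as in (109.11)."""
    h_ij = metric(base_vectors(y, x, t0))   # initial state, basis a_i
    g_ij = metric(base_vectors(y, x, t))    # final state, basis b_i
    return [[0.5 * (g_ij[i][j] - h_ij[i][j]) for j in range(3)] for i in range(3)]


K = 0.1  # hypothetical amount of shear per unit time


def simple_shear(x, t):
    """Assumed motion: y^1 = x^1 + K t x^2, y^2 = x^2, y^3 = x^3."""
    return [x[0] + K * t * x[1], x[1], x[2]]


eps = strain(simple_shear, [1.0, 2.0, 3.0], t0=0.0, t=1.0)
for row in eps:
    print(row)
```

For this motion $\mathbf{b}_1 = \mathbf{c}_1$ and $\mathbf{b}_2 = K\mathbf{c}_1 + \mathbf{c}_2$ at $t = 1$, so that the only nonvanishing components are $\epsilon_{12} = \epsilon_{21} = K/2$ and $\epsilon_{22} = K^2/2$, which the computed values should reproduce up to rounding.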


Since (109.12) is an invariant and $\epsilon_{ij} = \epsilon_{ji}$, we conclude that the set of
functions $\epsilon_{ij}(x, t)$ represents a tensor $E_0$ with respect to a class of admissible
transformations of coordinates $X$, with the basis $\mathbf{a}_i$, covering the
region $\tau_0$. The same set of functions $\epsilon_{ij}(x, t)$ also determines a tensor $E$
with respect to a set of transformations of coordinates determined by the
basis $\mathbf{b}_i$ of the final state $\tau$. In the notation of the concluding paragraph
of Sec. 45, the tensor $E_0$ is specified by the multilinear form $E_0 = \epsilon_{ij}\mathbf{a}^i\mathbf{a}^j$,
whereas the tensor $E$ is determined by $E = \epsilon_{ij}\mathbf{b}^i\mathbf{b}^j$. Thus the operations of
covariant differentiation and those of raising and lowering indices on the
components of $E_0$ involve the metric tensor $h_{ij}$, whereas the corresponding
operations on $E$ make use of the tensor $g_{ij}$. Accordingly,


$h^{ij}\epsilon_{jk} = \epsilon^i_k$ and $g^{ij}\epsilon_{jk} = \epsilon^i_k.$


However, the two sets of functions $\epsilon^i_k$ so computed in general are distinct,
and to indicate the origin of the set $\epsilon^i_k$ obtained with the use of the tensor
$h_{ij}$ we shall write

$h^{ij}\epsilon_{jk} = \overset{0}{\epsilon}{}^i_k.$


It will be shown in the following section that either of the tensors $E_0$ or
$E$ can serve to characterize the state of deformation of the neighborhood
of $P_0$. Tensors $E_0$ and $E$ are sometimes called, respectively, the Lagrangean
and Eulerian strain tensors in accordance with the two viewpoints of 
hydrodynamics, associated with the choices of coordinates of the initial or 
final states as independent variables in the formulation of hydrodynamical 
equations. 


110. Geometric Interpretation of Strain Tensors $E_0$ and $E$


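In the preceding section we defined the set of functions $\epsilon_{ij}$ by the
formula

$(ds)^2 - (ds_0)^2 = 2\epsilon_{ij}\,dx^i\,dx^j,$

where

(110.1)    $2\epsilon_{ij} = g_{ij} - h_{ij}.$

Since $g_{ij} = \mathbf{b}_i \cdot \mathbf{b}_j$ and $h_{ij} = \mathbf{a}_i \cdot \mathbf{a}_j$, we can write (110.1) as

(110.2)    $2\epsilon_{ij} = \mathbf{b}_i \cdot \mathbf{b}_j - \mathbf{a}_i \cdot \mathbf{a}_j = |\mathbf{b}_i|\,|\mathbf{b}_j| \cos\theta_{ij} - |\mathbf{a}_i|\,|\mathbf{a}_j| \cos\theta^0_{ij},$


where $\theta_{ij}$ is the angle between the base vectors $\mathbf{b}_i$ and $\mathbf{b}_j$, and $\theta^0_{ij}$ is the
angle between $\mathbf{a}_i$ and $\mathbf{a}_j$. If we denote by $e$ the change in length per unit
length of the vector $d\mathbf{r}_0 = \overrightarrow{P_0P_0'}$ in Fig. 51, so that

$e = \dfrac{|d\mathbf{r}| - |d\mathbf{r}_0|}{|d\mathbf{r}_0|} = \dfrac{ds - ds_0}{ds_0},$

we have


(110.3)    $|d\mathbf{r}| = (1 + e)\,|d\mathbf{r}_0|.$


We call $e$ the elongation of $d\mathbf{r}_0$ and we see from (110.3) that the elongations
$e_i$ in the directions of base vectors $\mathbf{a}_i$ are given by


(110.4)    $|\mathbf{b}_i| = (1 + e_i)\,|\mathbf{a}_i|.$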


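However, $|\mathbf{b}_i| = \sqrt{g_{ii}}$ and $|\mathbf{a}_i| = \sqrt{h_{ii}}$, so that

(110.5)    $\sqrt{g_{ii}} = (1 + e_i)\sqrt{h_{ii}}$    (no sum on $i$),


and hence formula 110.2 can be written with the aid of (110.4) and (110.5)
as


(110.6)    $\dfrac{2\epsilon_{ij}}{\sqrt{h_{ii}}\,\sqrt{h_{jj}}} = (1 + e_i)(1 + e_j)\cos\theta_{ij} - \cos\theta^0_{ij}.$


Since $\theta^0_{ij} = \theta_{ij} = 0$ for $i = j$, equation 110.6 yields


$\dfrac{2\epsilon_{ii}}{h_{ii}} = (1 + e_i)^2 - 1,$

or

(110.7)    $e_i = \sqrt{1 + \dfrac{2\epsilon_{ii}}{h_{ii}}} - 1.$


When the coordinates of the initial state are rectangular cartesians,
$h_{ii} = 1$, and we see from (110.7) that for $2\epsilon_{ii}/h_{ii} \ll 1$, $e_i \doteq \epsilon_{ii}$. Accordingly,
the functions $\epsilon_{11}$, $\epsilon_{22}$, $\epsilon_{33}$ are related to the elongations of arc elements
directed along the base vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$.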

The significance of the $\epsilon_{ij}$ for $i \neq j$ follows from (110.6) on noting that
when $\mathbf{a}_i$ and $\mathbf{a}_j$ are orthogonal unit vectors, $\theta^0_{ij} = \pi/2$. If we set $\theta_{ij} =
\pi/2 - \alpha_{ij}$, so that $\alpha_{ij}$ represents the change in the initially right angle
between the pair of arc elements directed along $\mathbf{a}_i$ and $\mathbf{a}_j$, formula 110.6
gives

$2\epsilon_{ij} = (1 + e_i)(1 + e_j)\sin\alpha_{ij},$

or

(110.8)    $\sin\alpha_{ij} = \dfrac{2\epsilon_{ij}}{\sqrt{1 + 2\epsilon_{ii}}\,\sqrt{1 + 2\epsilon_{jj}}},$

where we recalled (110.7). If $2\epsilon_{ii} \ll 1$ and the angle $\alpha_{ij}$ is small, we have
an approximate equality, $\alpha_{ij} \doteq 2\epsilon_{ij}$. Thus the functions $\epsilon_{ij}$ for $i \neq j$
provide a measure of the decrease in the initially right angle between the
arc elements parallel to the vectors $\mathbf{a}_i$ and $\mathbf{a}_j$. The components $\epsilon_{ij}$ for
$i \neq j$ are called shearing components of the strain tensor $E_0$, and the
components $\epsilon_{ij}$ for $i = j$ are the normal components of $E_0$.

Quite analogous interpretations can be provided for the functions $\epsilon_{ij}$
when these are viewed as components of the tensor $E = \epsilon_{ij}\mathbf{b}^i\mathbf{b}^j$. Indeed,
if we now define the elongation $e$ as the change in length per unit final
length $|d\mathbf{r}|$ of the arc element so that

$e = \dfrac{ds - ds_0}{ds},$

the calculations similar to those that have led to formulas 110.7 and 110.8
now yield

(110.9)    $e_i = 1 - \sqrt{1 - \dfrac{2\epsilon_{ii}}{g_{ii}}}$

and

(110.10)    $\sin\beta_{ij} = \dfrac{2\epsilon_{ij}}{\sqrt{1 - 2\epsilon_{ii}}\,\sqrt{1 - 2\epsilon_{jj}}}$    (no sums),

where $\beta_{ij} = \theta^0_{ij} - \pi/2$.

We conclude, as before, that the components $\epsilon_{ii}$ in (110.9) are associated
with elongations of the arc elements originally parallel to the base vectors
$\mathbf{b}_i$, whereas the components $\epsilon_{ij}$ for $i \neq j$ measure the corresponding
shearing deformations.
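
Formulas 110.7 and 110.8 can be tried out on a concrete set of strain components. The Python sketch below is an illustration under stated assumptions: the initial coordinates are taken to be rectangular cartesian, so that $h_{ii} = 1$, and the numerical values are those of a simple shear $y^1 = x^1 + Kx^2$, $y^2 = x^2$ with the hypothetical constant `K = 0.1`, for which $\epsilon_{11} = 0$, $\epsilon_{22} = K^2/2$, $\epsilon_{12} = K/2$. The function names are ours.

```python
import math


def elongation(eps_ii, h_ii=1.0):
    """Elongation e_i along a_i from formula (110.7)."""
    return math.sqrt(1.0 + 2.0 * eps_ii / h_ii) - 1.0


def shear_angle(eps_ij, eps_ii, eps_jj):
    """Decrease alpha_ij of the initially right angle, formula (110.8), with h_ii = 1."""
    return math.asin(2.0 * eps_ij /
                     (math.sqrt(1.0 + 2.0 * eps_ii) * math.sqrt(1.0 + 2.0 * eps_jj)))


K = 0.1                                            # hypothetical amount of shear
eps_11, eps_22, eps_12 = 0.0, 0.5 * K ** 2, 0.5 * K  # strain components of the simple shear

e_1 = elongation(eps_11)
e_2 = elongation(eps_22)
alpha_12 = shear_angle(eps_12, eps_11, eps_22)
print(e_1, e_2, alpha_12)
```

In this example the base vector $\mathbf{b}_2 = K\mathbf{c}_1 + \mathbf{c}_2$ has length $\sqrt{1 + K^2}$ and is inclined to $\mathbf{b}_1 = \mathbf{c}_1$ at the angle $\pi/2 - \tan^{-1}K$, so one expects $e_2 = \sqrt{1 + K^2} - 1$ and $\alpha_{12} = \tan^{-1}K$; for small $K$ these reduce to the approximations $e_2 \doteq \epsilon_{22}$ and $\alpha_{12} \doteq 2\epsilon_{12}$.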


111. Strain Quadric. Principal Strains 


The defining formula 109.12 for components $\epsilon_{ij}$ of the strain tensor
$E = \epsilon_{ij}\mathbf{b}^i\mathbf{b}^j$ can be written as

(111.1)  $\frac{(ds)^2 - (ds_0)^2}{2(ds)^2} = \epsilon_{ij}\frac{dx^i}{ds}\frac{dx^j}{ds}$,

where $dx^i/ds = \lambda^i$ is the unit vector determining the direction of the
vector $d\mathbf{r}$ in the final state. We seek to determine those directions $\lambda^i$ for
which (111.1) takes on extreme values. Accordingly, we set

(111.2)  $Q(\lambda) = \epsilon_{ij}\lambda^i\lambda^j$

and maximize the quadratic form $Q(\lambda)$ subject to the constraining relation

$\varphi(\lambda) \equiv g_{ij}\lambda^i\lambda^j - 1 = 0$,

requiring that $\lambda^i$ be a unit vector.
The familiar procedure for determining the extreme values of (111.2)
by the method of Lagrange multipliers leads to the system of equations

$\frac{\partial Q}{\partial\lambda^j} - \epsilon\frac{\partial\varphi}{\partial\lambda^j} = 0$

or

(111.3)  $(\epsilon_{ij} - \epsilon g_{ij})\lambda^i = 0$,

where $\epsilon$ is the Lagrange multiplier.
This system possesses nontrivial solutions for $\lambda^i$ if, and only if,

$|\epsilon_{ij}(x) - \epsilon g_{ij}(x)| = 0$

at each point $P(x)$ of the region $\tau$. In order to reduce the system 111.3 to
the form 13.10 considered in Sec. 13, we multiply (111.3) by $g^{jk}$, sum on $j$,
and obtain

(111.4)  $(\epsilon_i^k - \epsilon\delta_i^k)\lambda^i = 0$,

where

(111.5)  $\epsilon_i^k = g^{jk}\epsilon_{ij}$.


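The system 111.4 has three nontrivial solutions $\lambda_{(1)}^i$, $\lambda_{(2)}^i$, $\lambda_{(3)}^i$ ($i = 1, 2, 3$),
corresponding to the roots $\epsilon_i$ of the cubic

(111.6)  $|\epsilon_j^i - \epsilon\delta_j^i| = -\epsilon^3 + \vartheta_1\epsilon^2 - \vartheta_2\epsilon + \vartheta_3 = 0$.

The coefficients $\vartheta_i$ in this cubic are the invariants

(111.7)  $\vartheta_1 = \epsilon_1 + \epsilon_2 + \epsilon_3$,
         $\vartheta_2 = \epsilon_2\epsilon_3 + \epsilon_3\epsilon_1 + \epsilon_1\epsilon_2$,
         $\vartheta_3 = \epsilon_1\epsilon_2\epsilon_3$.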


It was shown in Secs. 13-15 that the roots $\epsilon_i$ are necessarily real and
the directions $\lambda_{(1)}^i$, $\lambda_{(2)}^i$, $\lambda_{(3)}^i$ associated with them are orthogonal.

The quadratic form 111.2, where we regard the $\lambda^i$ as the running coordinates,
reduces to the canonical form

(111.8)  $Q(y) = \epsilon_1(y^1)^2 + \epsilon_2(y^2)^2 + \epsilon_3(y^3)^2$,

provided that the principal directions $\lambda_{(1)}^i$, $\lambda_{(2)}^i$, $\lambda_{(3)}^i$ are chosen as the base
vectors of a suitable orthogonal cartesian reference system $Y$ in $\tau$.

We can interpret these results geometrically by introducing a strain
quadric

(111.9)  $\epsilon_{ij}(x)\lambda^i\lambda^j = \text{constant}$,

which, at each point $P(x)$, represents a quadric surface with the $\lambda^i$ as the
running coordinates. The principal directions $\lambda_{(k)}^i$ coincide with the axes
of the quadric 111.9, and it follows from (111.8) that the strain tensor $\epsilon_{ij}$,
when referred to the frame $Y$, has the form

$\begin{pmatrix} \epsilon_1 & 0 & 0 \\ 0 & \epsilon_2 & 0 \\ 0 & 0 & \epsilon_3 \end{pmatrix}$.

From the geometrical significance of components $\epsilon_{ij}$, $i \neq j$ (see equation
110.10), it follows that the principal directions are those orthogonal directions
in the undeformed state which remain orthogonal after deformation.

The strains $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ are termed the principal strains.




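The invariants $\vartheta_1$, $\vartheta_2$, $\vartheta_3$ defined by (111.7) play an important role in
the construction of models of continuous media. If we expand the
determinant in (111.6) and equate the coefficients of like powers of $\epsilon$ in the
result, we find

(111.10)
$\vartheta_1 = \epsilon_1^1 + \epsilon_2^2 + \epsilon_3^3 = \epsilon_i^i$,

$\vartheta_2 = \begin{vmatrix} \epsilon_2^2 & \epsilon_3^2 \\ \epsilon_2^3 & \epsilon_3^3 \end{vmatrix} + \begin{vmatrix} \epsilon_3^3 & \epsilon_1^3 \\ \epsilon_3^1 & \epsilon_1^1 \end{vmatrix} + \begin{vmatrix} \epsilon_1^1 & \epsilon_2^1 \\ \epsilon_1^2 & \epsilon_2^2 \end{vmatrix} = \frac{1}{2!}\delta_{kl}^{ij}\epsilon_i^k\epsilon_j^l$,

$\vartheta_3 = \begin{vmatrix} \epsilon_1^1 & \epsilon_2^1 & \epsilon_3^1 \\ \epsilon_1^2 & \epsilon_2^2 & \epsilon_3^2 \\ \epsilon_1^3 & \epsilon_2^3 & \epsilon_3^3 \end{vmatrix} = \frac{1}{3!}\delta_{lmn}^{ijk}\epsilon_i^l\epsilon_j^m\epsilon_k^n$.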

We will see in the following section how these invariants enter in the
expression for the ratio of the volume elements $d\tau_0$ and $d\tau$ of the initial
and deformed states.

We could have equally well considered the quadratic form

(111.11)  $Q_0 = \epsilon_{ij}\lambda_0^i\lambda_0^j$

with $\lambda_0^i = dx^i/ds_0$ specifying the direction of the vector $d\mathbf{r}_0$ of the initial
state, and with the $\epsilon_{ij}$'s regarded as components of the strain tensor $E_0 =
\epsilon_{ij}\mathbf{a}^i\mathbf{a}^j$.

For the determination of principal directions we now have the set of
equations of the type 111.4 in which

(111.12)  $\overset{0}{\epsilon}{}_j^i = h^{ik}\epsilon_{kj}$,

and the values of $\epsilon$ are the roots $\epsilon_i^0$ of the characteristic equation

(111.13)  $|\overset{0}{\epsilon}{}_j^i - \epsilon\delta_j^i| = 0$,

in which the $\overset{0}{\epsilon}{}_j^i$ are given by (111.12). The quadratic form 111.11 can thus
be reduced to a canonical form

$Q_0 = \epsilon_1^0(y_0^1)^2 + \epsilon_2^0(y_0^2)^2 + \epsilon_3^0(y_0^3)^2$

when the principal directions $\lambda_{0(1)}^i$, $\lambda_{0(2)}^i$, $\lambda_{0(3)}^i$ are taken as the basis of a
suitable orthogonal frame $Y_0$ in $\tau_0$.

It follows from formulas 110.7 that the elongations $e_i^0$ along the
principal directions are

(111.14)  $e_i^0 = \frac{ds - ds_0}{ds_0} = \sqrt{1 + 2\epsilon_i^0} - 1$,

whereas the elongations $e_i$, reckoned per unit length in the final state
(cf. equation 110.9), are

(111.15)  $e_i = \frac{ds - ds_0}{ds} = 1 - \sqrt{1 - 2\epsilon_i}$.

We conclude from (111.14) and (111.15) that

(111.16)  $\epsilon_i^0 = \frac{\epsilon_i}{1 - 2\epsilon_i}$ and $\epsilon_i = \frac{\epsilon_i^0}{1 + 2\epsilon_i^0}$.

Formulas 111.16 permit us to express the invariants $\vartheta_i^0$ of the cubic
(111.13) in terms of the invariants $\vartheta_i$ given in (111.7), and the invariants
$\vartheta_i$ of the cubic (111.6) in terms of the $\vartheta_i^0$.


Problem 
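Show that

$\vartheta_1 = \frac{\vartheta_1^0 + 4\vartheta_2^0 + 12\vartheta_3^0}{1 + 2\vartheta_1^0 + 4\vartheta_2^0 + 8\vartheta_3^0}$,

$\vartheta_2 = \frac{\vartheta_2^0 + 6\vartheta_3^0}{1 + 2\vartheta_1^0 + 4\vartheta_2^0 + 8\vartheta_3^0}$,

$\vartheta_3 = \frac{\vartheta_3^0}{1 + 2\vartheta_1^0 + 4\vartheta_2^0 + 8\vartheta_3^0}$.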
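
Before attempting the proof, the relations between the two sets of invariants can be sanity-checked numerically. The sketch below is only such a check, not a derivation: it starts from three hypothetical principal strains of the initial state, obtains the principal strains of the final state from the second formula of (111.16), and compares both members of each relation stated in the Problem.

```python
def sym_invariants(a, b, c):
    """Invariants built from three principal values, as in (111.7)."""
    return a + b + c, b * c + c * a + a * b, a * b * c

eps0 = (0.10, -0.05, 0.20)                       # hypothetical principal strains of E_0
eps = tuple(x / (1.0 + 2.0 * x) for x in eps0)    # second formula of (111.16)
back = tuple(x / (1.0 - 2.0 * x) for x in eps)    # first formula of (111.16) inverts it

t1_0, t2_0, t3_0 = sym_invariants(*eps0)
t1, t2, t3 = sym_invariants(*eps)
den = 1.0 + 2.0 * t1_0 + 4.0 * t2_0 + 8.0 * t3_0

print("theta_1:", t1, (t1_0 + 4.0 * t2_0 + 12.0 * t3_0) / den)
print("theta_2:", t2, (t2_0 + 6.0 * t3_0) / den)
print("theta_3:", t3, t3_0 / den)
print("round trip through (111.16):", back)
```

The common denominator is the product $(1 + 2\epsilon_1^0)(1 + 2\epsilon_2^0)(1 + 2\epsilon_3^0)$ written in terms of the invariants, which is the observation on which an analytic proof can be based.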


112. Distortion of Volume Elements 


We investigate next the change in volume elements $d\tau_0$ and $d\tau$ of the
initial and deformed states and indicate its connection with the invariants
$\vartheta_i$ introduced in Sec. 111.

It follows from the definition of the volume element in Sec. 44 that

$d\tau_0 = \sqrt{h}\,dx^1\,dx^2\,dx^3$ and $d\tau = \sqrt{g}\,dx^1\,dx^2\,dx^3$,

where $h = |h_{ij}|$ and $g = |g_{ij}|$ are the determinants of the quadratic forms
$ds_0^2 = h_{ij}\,dx^i\,dx^j$ and $ds^2 = g_{ij}\,dx^i\,dx^j$.
Thus

(112.1)  $\frac{d\tau_0}{d\tau} = \sqrt{\frac{h}{g}}$.

The set of functions $h_{ij}(x)$ can be regarded as components of the tensor
$H = h_{ij}\mathbf{b}^i\mathbf{b}^j$ defined in the space of the variables $x^i$ in the final state, so that

$g^{ik}h_{kj} = h_j^i$

and

$|g^{ik}h_{kj}| = |h_j^i|$.

We conclude that

$|g^{ik}|\,|h_{kj}| = |h_j^i|$,

so that

$\frac{h}{g} = |h_j^i|$.

Consequently the ratio 112.1 assumes the form

(112.2)  $\frac{d\tau_0}{d\tau} = \sqrt{|h_j^i|}$.

But from definition 110.1 we have

$h_{ij}(x) = g_{ij}(x) - 2\epsilon_{ij}(x)$,

which, upon raising the indices, reads

$h_j^i = \delta_j^i - 2\epsilon_j^i$.

We can therefore write formula 112.2 as

(112.3)  $\frac{d\tau_0}{d\tau} = \sqrt{|\delta_j^i - 2\epsilon_j^i|}$.

If we expand the determinant appearing under the radical sign, we find

(112.4)  $|\delta_j^i - 2\epsilon_j^i| = 1 - 2\vartheta_1 + 4\vartheta_2 - 8\vartheta_3$,

where the $\vartheta_i$ are the invariants 111.10.
In the linear theory of deformation the products of strains $\epsilon_j^i$ can be
disregarded, so that an approximate expression for the ratio 112.3 is

$\frac{d\tau_0}{d\tau} = \sqrt{1 - 2\vartheta_1} \approx 1 - \vartheta_1$.

Thus approximately

(112.5)  $\frac{d\tau - d\tau_0}{d\tau} = \vartheta_1$.

This represents the change in volume per unit volume, and, for this
reason, $\vartheta_1$ is called the dilatation. It figures prominently in linear elasticity
and hydrodynamics.
Formula 112.3 can be cast in the form

(112.6)  $\frac{d\tau_0}{d\tau} = \frac{1}{\sqrt{1 + 2\vartheta_1^0 + 4\vartheta_2^0 + 8\vartheta_3^0}}$

by expressing the invariants $\vartheta_i$ in terms of the $\vartheta_i^0$ as in the Problem of
Sec. 111. When the deformations are small, it follows from (112.6) that

(112.7)  $\frac{d\tau - d\tau_0}{d\tau_0} = \vartheta_1^0$,

and, since for small deformations $\epsilon_i^0 \approx \epsilon_i$, $\vartheta_1^0 \approx \vartheta_1$, both formulas
112.5 and 112.7 give the same value for the dilatation.
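
The volume formulas can be tried on a rectangular element whose edges lie along the principal directions. The sketch below is an illustration under stated assumptions: the three principal stretches $ds/ds_0$ are hypothetical, the exact volume ratio is taken to be the reciprocal of their product, and the principal strains are obtained from $ds^2 = (1 + 2\epsilon_i^0)\,ds_0^2$ and $ds_0^2 = (1 - 2\epsilon_i)\,ds^2$.

```python
import math

stretch = (1.010, 0.995, 1.002)  # hypothetical principal stretches ds/ds_0

eps = [0.5 * (1.0 - 1.0 / s**2) for s in stretch]   # principal strains of E
eps0 = [0.5 * (s**2 - 1.0) for s in stretch]        # principal strains of E_0

def sym_invariants(p):
    """Invariants of three principal values, as in (111.7)."""
    a, b, c = p
    return a + b + c, b * c + c * a + a * b, a * b * c

t1, t2, t3 = sym_invariants(eps)
s1, s2, s3 = sym_invariants(eps0)

# Exact ratio d(tau_0)/d(tau) for an element aligned with the principal directions.
ratio_exact = 1.0 / (stretch[0] * stretch[1] * stretch[2])
# The same ratio from (112.3) with the expansion (112.4).
ratio_final = math.sqrt(1.0 - 2.0 * t1 + 4.0 * t2 - 8.0 * t3)
# The same ratio from (112.6), in terms of the invariants of the initial state.
ratio_initial = 1.0 / math.sqrt(1.0 + 2.0 * s1 + 4.0 * s2 + 8.0 * s3)

dilatation_exact = 1.0 - ratio_exact  # (d tau - d tau_0) / d tau

print("d(tau_0)/d(tau):", ratio_exact, ratio_final, ratio_initial)
print("dilatation: exact", dilatation_exact, " linear (112.5)", t1, " linear (112.7)", s1)
```

For stretches this close to unity the two linear estimates differ from the exact dilatation only in the fourth decimal place; with larger stretches the discrepancy grows and the full expression (112.3) should be used.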


Problem 
Obtain formulas 112.3 and 112.6 directly from (111.14) and (111.15). 


113. Displacements in Continuous Media 
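We define the displacement vector $\boldsymbol{\xi}$ of the point $P_0$ (Fig. 51) by

(113.1)  $\boldsymbol{\xi} = \mathbf{r} - \mathbf{r}_0$

and denote the components of $\boldsymbol{\xi}$ relative to the basis $\mathbf{a}_i$ by $u^i$ and its components
relative to the basis $\mathbf{b}_i$ by $w^i$. Thus

(113.2)  $\boldsymbol{\xi} = u^i\mathbf{a}_i$,  $\boldsymbol{\xi} = w^i\mathbf{b}_i$.

From (113.1) we have

$\frac{\partial\boldsymbol{\xi}}{\partial x^i} = \frac{\partial\mathbf{r}}{\partial x^i} - \frac{\partial\mathbf{r}_0}{\partial x^i} = \mathbf{b}_i - \mathbf{a}_i$,

so that

(113.3)  $\mathbf{b}_i = \mathbf{a}_i + \frac{\partial\boldsymbol{\xi}}{\partial x^i}$.

On computing $g_{ij} = \mathbf{b}_i\cdot\mathbf{b}_j$ with the aid of (113.3) and subtracting
$h_{ij} = \mathbf{a}_i\cdot\mathbf{a}_j$ from the result, we find

(113.4)  $g_{ij} - h_{ij} = \frac{\partial\boldsymbol{\xi}}{\partial x^i}\cdot\frac{\partial\boldsymbol{\xi}}{\partial x^j} + \mathbf{a}_i\cdot\frac{\partial\boldsymbol{\xi}}{\partial x^j} + \mathbf{a}_j\cdot\frac{\partial\boldsymbol{\xi}}{\partial x^i} = 2\epsilon_{ij}$, by (110.1).

Equations 113.4 can be regarded as a set of differential equations for the
components of $\boldsymbol{\xi}$ when the functions $\epsilon_{ij}$ are specified. This set of equations
assumes quite simple form when the displacement vector $\boldsymbol{\xi}$ is expressed in
terms of its covariant components $u_i$ or $w_i$, so that

(113.5)  $\boldsymbol{\xi} = u_i\mathbf{a}^i$,  $\boldsymbol{\xi} = w_i\mathbf{b}^i$,

the $\mathbf{a}^i$ and $\mathbf{b}^i$ being the reciprocal base vectors introduced in Sec. 45.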


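On differentiating (113.5) with respect to $x^j$, we get (cf. Sec. 45)

(113.6)  $\frac{\partial\boldsymbol{\xi}}{\partial x^j} = u_{i,j}\mathbf{a}^i$,  $\frac{\partial\boldsymbol{\xi}}{\partial x^j} = w_{i,j}\mathbf{b}^i$,

where

(113.7)  $u_{i,j} = \frac{\partial u_i}{\partial x^j} - {}_h\begin{Bmatrix} k \\ i\,j \end{Bmatrix} u_k$

is the covariant derivative of $u_i$ with respect to the metric $h_{ij}$ of the initial
state and

(113.8)  $w_{i,j} = \frac{\partial w_i}{\partial x^j} - {}_g\begin{Bmatrix} k \\ i\,j \end{Bmatrix} w_k$

is the covariant derivative of $w_i$ with respect to the metric $g_{ij}$ of the final
state. The prescripts on the Christoffel symbols in (113.7) and (113.8)
indicate that these symbols in (113.7) are constructed from the tensor $h_{ij}$,
whereas those in (113.8) from the $g_{ij}$'s.

If we insert from the first of formulas 113.6 in 113.4, we get

$2\epsilon_{ij} = (u_{k,i}\mathbf{a}^k\cdot u_{l,j}\mathbf{a}^l) + (\mathbf{a}_i\cdot\mathbf{a}^k u_{k,j}) + (\mathbf{a}_j\cdot\mathbf{a}^k u_{k,i})$
$\quad = u_{k,i}u_{l,j}h^{kl} + \delta_i^k u_{k,j} + \delta_j^k u_{k,i}$
$\quad = u^k_{\ ,i}u_{k,j} + u_{i,j} + u_{j,i}$,

since $\mathbf{a}^k\cdot\mathbf{a}^l = h^{kl}$.
Thus

(113.9)  $2\epsilon_{ij} = u_{i,j} + u_{j,i} + u^k_{\ ,i}u_{k,j}$.

On the other hand, when $\boldsymbol{\xi}$ is represented in the form $\boldsymbol{\xi} = w_i\mathbf{b}^i$, we recall
(113.3) and write

$h_{ij} = \mathbf{a}_i\cdot\mathbf{a}_j = \left(\mathbf{b}_i - \frac{\partial\boldsymbol{\xi}}{\partial x^i}\right)\cdot\left(\mathbf{b}_j - \frac{\partial\boldsymbol{\xi}}{\partial x^j}\right)$.

The substitution into this expression from the second of the formulas in
(113.6) yields

(113.10)  $2\epsilon_{ij} = w_{i,j} + w_{j,i} - w^k_{\ ,i}w_{k,j}$.

Formulas 113.9 enable us to compute the strain components $\epsilon_{ij}$ from
components $u_i$ of the vector $\boldsymbol{\xi}$ referred to the basis $\mathbf{a}_i$ of the initial state.
Formulas 113.10, on the other hand, involve the components of $\boldsymbol{\xi}$ relative
to the basis $\mathbf{b}_i$ of the final state. Alternatively, when the functions $\epsilon_{ij}$ are
specified, equations 113.9 and 113.10 are differential equations serving to
determine the components of the displacement vector $\boldsymbol{\xi}$.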


When the reference system $X$ is orthogonal cartesian, we set $y^i = x^i$
and obtain from (113.9) and (113.10)

(113.11)  $2\epsilon_{ij} = \frac{\partial u_i}{\partial y_0^j} + \frac{\partial u_j}{\partial y_0^i} + \frac{\partial u_k}{\partial y_0^i}\frac{\partial u_k}{\partial y_0^j}$,

(113.12)  $2\epsilon_{ij} = \frac{\partial w_i}{\partial y^j} + \frac{\partial w_j}{\partial y^i} - \frac{\partial w_k}{\partial y^i}\frac{\partial w_k}{\partial y^j}$,

where the labels $y_0^i$ refer to the cartesian coordinates in the initial state.

In special problems the derivatives of the displacement components may
be sufficiently small to justify one in neglecting products of these derivatives
in comparison with the first-order terms in these derivatives. In this event
equations 113.11 and 113.12 become linear and the theory of deformation
based on a study of resulting linear differential equations is called the
linear theory. In the linear theory it is usually assumed that the displacement
vector $\boldsymbol{\xi}$ is so small that one is justified in identifying the coordinates
$y_0^i$ and $y^i$ of the initial and final states. The resulting theory is
called the infinitesimal theory of deformation. In the infinitesimal theory,
formulas 113.11 and 113.12 coalesce and we write

(113.13)  $2e_{ij} = u_{i,j} + u_{j,i}$,

where the $e_{ij}$ are the infinitesimal components of the strain tensor $\epsilon_{ij}$. In
classical theory of elasticity, the strain tensor $\epsilon_{ij}$ is taken in the form
(113.13). The strain invariant $\vartheta_1 = e_{11} + e_{22} + e_{33}$, as follows from
(113.13), is then equal to the divergence of the displacement vector $u_i$, and
hence the dilatation $\vartheta_1 = (d\tau - d\tau_0)/d\tau = u^i_{\ ,i}$.
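
The cartesian formulas (113.11) to (113.13) can be exercised on a homogeneous simple shear. The sketch below is an illustration, not a general implementation: it assumes the deformation $y = F y_0$ with a hypothetical constant matrix `F` whose inverse is written down by hand, so that the displacement gradients are $\partial u_i/\partial y_0^j = F_{ij} - \delta_{ij}$ and $\partial w_i/\partial y^j = \delta_{ij} - (F^{-1})_{ij}$. The results are compared with the strains computed directly from the two metrics.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def strain_initial(G):
    """Formula (113.11): G[i][j] = d u_i / d y0^j."""
    GtG = matmul(transpose(G), G)
    return [[0.5 * (G[i][j] + G[j][i] + GtG[i][j]) for j in range(3)] for i in range(3)]

def strain_final(W):
    """Formula (113.12): W[i][j] = d w_i / d y^j."""
    WtW = matmul(transpose(W), W)
    return [[0.5 * (W[i][j] + W[j][i] - WtW[i][j]) for j in range(3)] for i in range(3)]

def strain_linear(G):
    """Formula (113.13): products of displacement gradients neglected."""
    return [[0.5 * (G[i][j] + G[j][i]) for j in range(3)] for i in range(3)]

gamma = 0.02  # hypothetical amount of shear
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
F = [[1.0, gamma, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F_inv = [[1.0, -gamma, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

G = [[F[i][j] - I3[i][j] for j in range(3)] for i in range(3)]
W = [[I3[i][j] - F_inv[i][j] for j in range(3)] for i in range(3)]

# Strains computed directly from the metrics of the two states.
FtF = matmul(transpose(F), F)
FiFi = matmul(transpose(F_inv), F_inv)
direct_initial = [[0.5 * (FtF[i][j] - I3[i][j]) for j in range(3)] for i in range(3)]
direct_final = [[0.5 * (I3[i][j] - FiFi[i][j]) for j in range(3)] for i in range(3)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

print("(113.11) vs metric:", max_diff(strain_initial(G), direct_initial))
print("(113.12) vs metric:", max_diff(strain_final(W), direct_final))
print("error of (113.13):", max_diff(strain_linear(G), strain_initial(G)))
```

The last figure is of order $\gamma^2$, which is the size of the terms discarded in passing to the infinitesimal theory.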


114. Equations of Compatibility 


Equations 113.10 or, in cartesian form, 113.12, can be viewed as a
system of six simultaneous partial differential equations for the determination
of three components of displacement from prescribed values of
the strain tensor. Clearly, if a solution of this system is to exist, components
of the strain tensor cannot be specified arbitrarily. To ensure the
integrability of the system it is necessary to impose certain restrictions on
the choice of functions $\epsilon_{ij}$. Such conditions were deduced and the proof
of their necessity,* for the linearized case typified by equation 113.13, was
given by B. Saint Venant in 1860. We indicate here how these integrability,
or compatibility, conditions can be deduced in the general case.


* For a proof of necessity and sufficiency of Saint Venant's conditions see I. S.
Sokolnikoff, Mathematical Theory of Elasticity (1946), pp. 24-25.


We recall that the space in which the deformations take place is Euclidean,
and hence the Riemann tensor, associated with the metric of Euclidean
space specified by $ds_0^2 = h_{ij}\,dx^i\,dx^j$, vanishes (see Sec. 39). Thus

(114.1)  ${}_hR_{ijkl} = \frac{\partial}{\partial x^k}\,{}_h[jl, i] - \frac{\partial}{\partial x^l}\,{}_h[jk, i] + {}_h\begin{Bmatrix} \alpha \\ j\,k \end{Bmatrix}\,{}_h[il, \alpha] - {}_h\begin{Bmatrix} \alpha \\ j\,l \end{Bmatrix}\,{}_h[ik, \alpha] = 0$,

where the Riemann tensor ${}_hR_{ijkl}$ is formed from the metric coefficients $h_{ij}$.
If we recall that (see 110.1)

$h_{ij} = g_{ij} - 2\epsilon_{ij}$,

compute the Christoffel symbols needed in (114.1) in terms of the $g_{ij}$ and
$\epsilon_{ij}$, and make use of the fact that the Riemann tensor ${}_gR_{ijkl}$ based on the
$g_{ij}$'s also vanishes, we get the condition

(114.2)  $\epsilon_{ijkl} + h^{pq}(\epsilon_{jkp}\epsilon_{ilq} - \epsilon_{jlp}\epsilon_{ikq}) = 0$,

where

$\epsilon_{ijkl} = \epsilon_{jl,ik} + \epsilon_{ik,jl} - \epsilon_{il,jk} - \epsilon_{jk,il}$,

$\epsilon_{ijk} = \epsilon_{ik,j} + \epsilon_{kj,i} - \epsilon_{ij,k}$,

and

$h^{pq} = \frac{H^{pq}}{h}$,

$H^{pq}$ being the cofactor³ of $h_{pq}$ in $|h_{ij}|$.
If we linearize (114.2) by dropping terms involving the products of the
$\epsilon_{ijk}$, we get Saint Venant's compatibility equations

(114.3)  $\epsilon_{jl,ik} + \epsilon_{ik,jl} - \epsilon_{il,jk} - \epsilon_{jk,il} = 0$,

familiar in the linear theory of strain.⁴

From the fact that in a three-dimensional space the Riemann tensor has
six independent nonzero components, it follows that there are six independent
equations in (114.2) and (114.3).
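
A plane example shows what the linearized compatibility equations demand of a prescribed strain field. In two dimensions and cartesian coordinates the linearized equations reduce to the single condition $e_{11,22} + e_{22,11} - 2e_{12,12} = 0$; this reduction, the sample strain fields, and the finite-difference step below are assumptions of the sketch. The first field is derived from the hypothetical displacement $u_1 = x y^2$, $u_2 = x^2 y$ and should satisfy the condition; the second has a modified shear component and should not.

```python
def second_derivative(f, x, y, kind, h=1e-2):
    """Central-difference second derivative of f(x, y); kind is 'xx', 'yy' or 'xy'."""
    if kind == "xx":
        return (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h**2
    if kind == "yy":
        return (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h**2
    if kind == "xy":
        return (f(x + h, y + h) - f(x + h, y - h)
                - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    raise ValueError(kind)

def compatibility_residual(e11, e22, e12, x, y):
    """Left-hand member of e_11,22 + e_22,11 - 2 e_12,12 = 0 at the point (x, y)."""
    return (second_derivative(e11, x, y, "yy")
            + second_derivative(e22, x, y, "xx")
            - 2.0 * second_derivative(e12, x, y, "xy"))

# Strains of the displacement u_1 = x*y**2, u_2 = x**2*y.
e11 = lambda x, y: y**2           # d u_1 / d x
e22 = lambda x, y: x**2           # d u_2 / d y
e12_ok = lambda x, y: 2.0 * x * y  # (d u_1/d y + d u_2/d x) / 2
e12_bad = lambda x, y: x * y       # arbitrarily altered shear component

res_ok = compatibility_residual(e11, e22, e12_ok, 0.3, -0.7)
res_bad = compatibility_residual(e11, e22, e12_bad, 0.3, -0.7)
print("compatible field residual:", res_ok)
print("incompatible field residual:", res_bad)
```

For the altered field no single-valued displacement exists, which is the content of the statement that the strain components cannot be specified arbitrarily.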


115. Analysis of Stressed State 


In analyzing the state of stress in a deformed body, it is natural to use
the variables $x^i$ of the final state as the independent variables. We will
demonstrate that the state of stress at a point $P(x)$ of a body, in equilibrium
under prescribed surface and body forces, is characterized by a symmetric
tensor, the stress tensor.


³ Note that the contravariant tensor $h^{pq}$ is not the associated tensor of $h_{ij}$ with respect to
the metric tensor $g_{ij}$. See Sec. 30.

⁴ See in this connection a paper by W. R. Seugling, American Mathematical Monthly,
vol. 57 (1950), pp. 679-681; also F. D. Murnaghan, Finite Deformation of an Elastic
Solid (1951) and L. I. Sedov, Introduction to Mechanics of a Continuous Medium (1962),
pp. 128-130.

Let a body $\tau$ be referred to a curvilinear coordinate system $X$, and
consider an element of surface area at some point $P'$ of the body. Let a
small tetrahedral volume element $d\tau$ be formed by the coordinate surfaces
at a nearby point $P$ and by the surface element $d\sigma$ (Fig. 52). If $\boldsymbol{\nu}$ is the unit
normal to $d\sigma$, then the elements of area $d\sigma_i$ lying in the coordinate surfaces
are given by the formulas

(115.1)  $d\sigma_i = \nu_i\,d\sigma$,

where the $\nu_i$ are the covariant components of $\boldsymbol{\nu}$.

We denote the stress vector (force per unit area) acting on $d\sigma$ by $\overset{\nu}{\mathbf{T}}$,
where the superscript $\nu$ brings into evidence the dependence of the stress
vector on the orientation of the element $d\sigma$. The stress vectors acting on
the surface elements $d\sigma_j$ are denoted by $\overset{j}{\mathbf{T}}$, and we take as their positive
directions the directions of the exterior normals to the volume element.
We can write

(115.2)  $\overset{j}{\mathbf{T}} = -\tau^{ij}\mathbf{b}_i$,

where the $\mathbf{b}_i$ are base vectors directed along the coordinate lines and the
$\tau^{ij}$ are the contravariant components of $\overset{j}{\mathbf{T}}$.
Now, if $\mathbf{F} = F^i\mathbf{b}_i$ denotes the force per unit volume acting on the mass
contained in $d\tau$, the first condition of equilibrium requires that

(115.3)  $\mathbf{F}\,d\tau + \overset{\nu}{\mathbf{T}}\,d\sigma + \overset{j}{\mathbf{T}}\,d\sigma_j = 0$.


If we note the definitions 115.1 and 115.2 and observe that $d\tau = l\,d\sigma$,
where $l$ is the appropriate factor depending on the linear dimension of the
volume element, the equilibrium condition 115.3 becomes

$F^i\mathbf{b}_i\,l\,d\sigma + T^i\mathbf{b}_i\,d\sigma - \tau^{ij}\nu_j\,d\sigma\,\mathbf{b}_i = 0$,

where $T^i\mathbf{b}_i = \overset{\nu}{\mathbf{T}}$.

If the point $P'$ is now made to approach $P$ so that the direction of $\boldsymbol{\nu}$
remains fixed, $l \to 0$, and the first term in the above relation will surely
vanish whenever the body force $\mathbf{F}$ is bounded. This leads to the result that
the components $T^i$ of the stress $\overset{\nu}{\mathbf{T}}$, acting on a surface element with the
orientation $\boldsymbol{\nu}$, are given by the formula

(115.4)  $T^i = \tau^{ij}\nu_j$.

Since $\overset{\nu}{\mathbf{T}}$ is a vector and $\nu_j$ is an arbitrary covariant vector, we conclude
that the $\tau^{ij}$ are the contravariant components of a tensor, the stress tensor.
Formula 115.4 permits us to calculate the stress vector acting on a surface
element with the specified orientation whenever a set of nine functions $\tau^{ij}$
is known. We will see in Sec. 116 that the application of the remaining
condition of equilibrium leads to the conclusion that the stress tensor is
symmetric.
We can obviously write (115.4) in the form

(115.5)  $T_i = \tau_{ij}\nu^j$.

The component $N$ of the vector $\overset{\nu}{\mathbf{T}}$ in the direction of the normal $\boldsymbol{\nu}$ is
$\overset{\nu}{\mathbf{T}}\cdot\boldsymbol{\nu} = T_i\nu^i$, so that, using (115.5),

(115.6)  $N = \tau_{ij}\nu^i\nu^j$.


In regard to the quadratic form 115.6, we can raise the question of
determining directions $\nu^i$ such that $N$ takes on the extreme values. As in
Sec. 111, this leads to the consideration of the characteristic equation

(115.7)  $|\tau_j^i - \tau\delta_j^i| = -\tau^3 + \Theta_1\tau^2 - \Theta_2\tau + \Theta_3 = 0$,

where

$\Theta_1 = \tau_1 + \tau_2 + \tau_3$,
$\Theta_2 = \tau_2\tau_3 + \tau_3\tau_1 + \tau_1\tau_2$,
$\Theta_3 = \tau_1\tau_2\tau_3$,

the $\tau_i$ being the roots of the cubic (115.7). The orthogonal directions $\nu^i$
corresponding to the principal stresses $\tau_i$ are determined from the set of
linear equations (cf. equation 111.4)

(115.8)  $(\tau_j^i - \tau\delta_j^i)\nu^j = 0$,

and are called the principal directions of stress. If we choose an orthogonal
cartesian frame $Y$ whose axes coincide with the principal directions
at $P$, the quadric surface

(115.9)  $\tau_{ij}\nu^i\nu^j = \text{constant}$

assumes the form

(115.10)  $\tau_1(y^1)^2 + \tau_2(y^2)^2 + \tau_3(y^3)^2 = \text{constant}$.

The quadric surface 115.9 was introduced by Cauchy, and it is called the
stress quadric.

It is obvious from (115.10) that the components $\tau_{ij}$, for $i \neq j$, vanish
when a suitable reference frame is chosen at $P$. The components $\tau_{11}$,
$\tau_{22}$, $\tau_{33}$ are called the normal components of stress, and the remaining ones
are shears.

By analogy with formulas 111.10, we can write down the expressions
for the stress invariants $\Theta_i$. They are

(115.11)  $\Theta_1 = \tau_i^i$,  $\Theta_2 = \frac{1}{2!}\delta_{kl}^{ij}\tau_i^k\tau_j^l$,  $\Theta_3 = |\tau_j^i|$.
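
Formulas (115.5) and (115.6) lend themselves to a direct computation. The sketch below assumes an orthogonal cartesian frame (so that index position is immaterial) and a hypothetical symmetric stress array, in arbitrary units, whose principal stresses and directions are known in closed form. It evaluates the stress vector, its normal component, and the magnitude of the shearing part on several planes.

```python
import math

def stress_vector(tau, nu):
    """T_i = tau_ij nu^j, formula (115.5), in an orthogonal cartesian frame."""
    return [sum(tau[i][j] * nu[j] for j in range(3)) for i in range(3)]

def normal_component(tau, nu):
    """N = tau_ij nu^i nu^j, formula (115.6)."""
    T = stress_vector(tau, nu)
    return sum(T[i] * nu[i] for i in range(3))

def shear_magnitude(tau, nu):
    """Magnitude of the part of the stress vector lying in the plane with normal nu."""
    T = stress_vector(tau, nu)
    N = normal_component(tau, nu)
    return math.sqrt(max(sum(t * t for t in T) - N * N, 0.0))

# Hypothetical stress components.
tau = [[2.0, 1.0, 0.0],
       [1.0, 2.0, 0.0],
       [0.0, 0.0, 5.0]]

r = 1.0 / math.sqrt(2.0)
# Principal stresses and principal directions of this particular array.
principal = [(3.0, [r, r, 0.0]), (1.0, [r, -r, 0.0]), (5.0, [0.0, 0.0, 1.0])]
for value, nu in principal:
    print("principal stress", value, " N =", normal_component(tau, nu),
          " shear =", shear_magnitude(tau, nu))

# A plane equally inclined to the coordinate axes carries both normal and shearing stress.
oblique = [1.0 / math.sqrt(3.0)] * 3
print("oblique plane: N =", normal_component(tau, oblique),
      " shear =", shear_magnitude(tau, oblique))
```

On the principal planes the stress vector is purely normal, in agreement with the vanishing of the components $\tau_{ij}$, $i \neq j$, in the principal frame.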


116. Differential Equations of Equilibrium 


Let a body $\tau$ be in a state of equilibrium under the action of prescribed
body and surface forces. Since every portion of the body is in equilibrium,
the resultant of all forces and the resultant moment of these forces acting
on every subregion $V$ of $\tau$ must vanish. The condition that the resultant
force in every direction vanishes yields the equation

(116.1)  $\int_V F^i\lambda_i\,d\tau + \int_S T^i\lambda_i\,d\sigma = 0$,

where $\lambda_i$ is the unit vector in an arbitrarily fixed direction.
We assume that the components of body force $F^i(x)$ are continuous
functions and that the components $T^i$ of the stress vector are of class $C^1$.
The substitution for $T^i$ from (115.4) and the application of divergence
theorem 92.3 to the surface integral in (116.1) yields the equation

$\int_V [F^i\lambda_i + (\tau^{ij}\lambda_i)_{,j}]\,d\tau = 0$.

Since $\lambda_i$ is a parallel vector field, $\lambda_{i,j} = 0$, so that the preceding equation
can be written

(116.2)  $\int_V (F^i + \tau^{ij}_{\ \ ,j})\lambda_i\,d\tau = 0$.

Since the integrand in (116.2) is continuous and the direction of $\lambda_i$ is
arbitrary, we conclude that, at every point $P$ of $\tau$,

(116.3)  $\tau^{ij}_{\ \ ,j} + F^i = 0$.


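We apply next the condition that the resultant moment of the body and
surface forces vanishes. If $\mathbf{r} = l^i\mathbf{b}_i$ is the position vector of the point $P'(x)$
relative to some point $P$, the component of the moment $(\mathbf{F}\times\mathbf{r})\,d\tau$ in the
direction of the unit vector $\boldsymbol{\lambda}$ is $\mathbf{F}\times\mathbf{r}\cdot\boldsymbol{\lambda}\,d\tau$. The component of the moment
due to the surface forces $\overset{\nu}{\mathbf{T}}$ is $\overset{\nu}{\mathbf{T}}\times\mathbf{r}\cdot\boldsymbol{\lambda}\,d\sigma$. Recalling (Sec. 49) the expression
for the triple scalar product

$\mathbf{A}\times\mathbf{B}\cdot\mathbf{C} = \epsilon_{ijk}A^iB^jC^k$

enables us to write

$\int_V \epsilon_{ijk}F^il^j\lambda^k\,d\tau + \int_S \epsilon_{ijk}T^il^j\lambda^k\,d\sigma = 0$.

The substitution in the surface integral from (115.4) and the application of
the divergence theorem yields

$\int_V \epsilon_{ijk}\lambda^k[F^il^j + (\tau^{im}l^j)_{,m}]\,d\tau = 0$,

since $\epsilon_{ijk,m} = 0$.

If we carry out the indicated covariant differentiation and make use of
equations 116.3, we get

$\int_V \epsilon_{ijk}\tau^{im}l^j_{\ ,m}\lambda^k\,d\tau = 0$;

and, since⁵ $l^j_{\ ,m} = \delta_m^j$ and $V$ is arbitrary, we conclude that

(116.4)  $\epsilon_{ijk}\tau^{ij}\lambda^k = 0$.

Noting that $\epsilon_{ijk} = -\epsilon_{jik}$ enables us to write this in the form

(116.5)  $\tfrac{1}{2}\epsilon_{ijk}(\tau^{ij} - \tau^{ji})\lambda^k = 0$.

Since $\epsilon_{ijk} = \sqrt{g}\,e_{ijk}$, and $\sqrt{g} \neq 0$, we have, upon expanding (116.5),

$(\tau^{23} - \tau^{32})\lambda^1 + (\tau^{31} - \tau^{13})\lambda^2 + (\tau^{12} - \tau^{21})\lambda^3 = 0$.

Inasmuch as the direction of $\boldsymbol{\lambda}$ is arbitrary, we conclude that

(116.6)  $\tau^{ij} = \tau^{ji}$.

Thus the stress tensor is symmetric.


⁵ For $\mathbf{b}_j = \partial\mathbf{r}/\partial x^j = l^i_{\ ,j}\mathbf{b}_i$ by (46.6). Hence $l^i_{\ ,j} = \delta_j^i$.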


We summarize these results in a

THEOREM. If a body is in equilibrium, under the action of prescribed
body and surface forces, then the components of the stress tensor $\tau^{ij}$ at each
point of the body satisfy the system of partial differential equations

$\tau^{ij}_{\ \ ,j} + F^i = 0$,

where $\tau^{ij} = \tau^{ji}$. On the surface $\Sigma$ of the body where stress vectors $T^i$ are
assigned,

$\tau^{ij}\nu_j = T^i$,

$\nu_j$ being the exterior unit normal to $\Sigma$.
We can write down at once the equations of motion by invoking the
principle of D'Alembert. We merely have to add to the body force $F^i$
the inertial force $-\rho a^i$, where $\rho$ is the density and $a^i$ is the acceleration.
Thus the equations of motion are

(116.7)  $\tau^{ij}_{\ \ ,j} + F^i = \rho a^i$,

where $F^i$ is the body force per unit volume. If $F^i$ represents the force per
unit mass, the equations of motion read

(116.8)  $\tau^{ij}_{\ \ ,j} = \rho(a^i - F^i)$.

Since all equations in Secs. 115 and 116 appear in tensor form, they are
valid in all admissible reference frames. In particular, in the reference
frame $X$ of the initial state $\tau_0$, the covariant derivatives in (116.8) are
taken with respect to the metric coefficients $h_{ij}$ and the $a^i$ and $F^i$ are components
of the acceleration and force vectors relative to the basis of the
initial state.
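
In cartesian coordinates the equilibrium equations (116.3) reduce to ordinary partial derivatives and can be verified numerically for a given stress field. The sketch below assumes a hypothetical hydrostatic state, $\tau^{ij} = -p\,\delta^{ij}$ with $p$ increasing linearly with depth, under a constant body force per unit volume directed along the negative $y^3$ axis; the constants and the finite-difference step are illustrative only.

```python
RHO, G, DEPTH = 1000.0, 9.81, 10.0  # hypothetical density, gravity, and reference level

def stress(y):
    """Hydrostatic stress tau^{ij} = -p delta^{ij}, with p = rho g (DEPTH - y^3)."""
    p = RHO * G * (DEPTH - y[2])
    return [[-p if i == j else 0.0 for j in range(3)] for i in range(3)]

def body_force(y):
    """Body force per unit volume (weight), as required by (116.3)."""
    return [0.0, 0.0, -RHO * G]

def divergence(stress_field, y, h=1e-3):
    """Central-difference approximation to tau^{ij}_{,j} in cartesian coordinates."""
    div = [0.0, 0.0, 0.0]
    for j in range(3):
        y_plus, y_minus = list(y), list(y)
        y_plus[j] += h
        y_minus[j] -= h
        t_plus, t_minus = stress_field(y_plus), stress_field(y_minus)
        for i in range(3):
            div[i] += (t_plus[i][j] - t_minus[i][j]) / (2.0 * h)
    return div

point = [0.3, -1.2, 4.0]
residual = [d + f for d, f in zip(divergence(stress, point), body_force(point))]
t = stress(point)
print("residual of (116.3):", residual)
print("symmetric, cf. (116.6):", all(t[i][j] == t[j][i] for i in range(3) for j in range(3)))
```

The same routine can be applied to any differentiable stress field; a nonzero residual indicates that the field is not in equilibrium under the assumed body force.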


117. Virtual Work 


Let a continuous medium be maintained in the state of equilibrium by
the body forces $F^i$ and surface forces $T^i$. If $\overrightarrow{P_0P} = \boldsymbol{\xi}(x, t)$ is the displacement
vector of the point $P_0$ (Fig. 51), we can consider a point $P'$ in the
neighborhood of $P$, and denote the vector $\overrightarrow{P_0P'}$ (not shown in Fig. 51) by
$\boldsymbol{\xi}'(x, t)$. Thus

$\boldsymbol{\xi}'(x, t) = \boldsymbol{\xi}(x, t) + \delta\boldsymbol{\xi}(x, t)$,

where the variation

(117.1)  $\delta\boldsymbol{\xi} = \boldsymbol{\xi}' - \boldsymbol{\xi}$,

or the virtual displacement of $P$, is an arbitrary vector $\overrightarrow{PP'}$ in the neighborhood
of $P$. We consider the variation of vectors only in the final state $\tau$,
and we shall say that the variations of vectors and tensors associated with
the points $P_0$ of the initial state $\tau_0$ are zero.


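We suppose that $\boldsymbol{\xi}$ is of class $C^2$ and define the variation of $\partial\boldsymbol{\xi}/\partial x^i$ by

(117.2)  $\delta\left(\frac{\partial\boldsymbol{\xi}}{\partial x^i}\right) = \frac{\partial\boldsymbol{\xi}'}{\partial x^i} - \frac{\partial\boldsymbol{\xi}}{\partial x^i} = \frac{\partial(\delta\boldsymbol{\xi})}{\partial x^i}$,

so that the variation of the derivative $\partial\boldsymbol{\xi}/\partial x^i$ is equal to the derivative of
the variation (cf. Sec. 81). Since $\boldsymbol{\xi} = \mathbf{r} - \mathbf{r}_0$, and

$\frac{\partial\boldsymbol{\xi}}{\partial x^i} = \frac{\partial\mathbf{r}}{\partial x^i} - \frac{\partial\mathbf{r}_0}{\partial x^i} = \mathbf{b}_i - \mathbf{a}_i$,

we obtain, on utilizing the distributive property of the symbol $\delta$,

$\delta\left(\frac{\partial\boldsymbol{\xi}}{\partial x^i}\right) = \delta(\mathbf{b}_i - \mathbf{a}_i) = \delta\mathbf{b}_i$,

for $\delta\mathbf{a}_i = 0$, since points in the initial state are not varied. Thus

(117.3)  $\delta\mathbf{b}_i = \delta\left(\frac{\partial\boldsymbol{\xi}}{\partial x^i}\right) = \frac{\partial(\delta\boldsymbol{\xi})}{\partial x^i}$.

The metric coefficients of the final state are given by $g_{ij} = \mathbf{b}_i\cdot\mathbf{b}_j$ and
we find, as in Sec. 81,

$\delta g_{ij} = \delta(\mathbf{b}_i\cdot\mathbf{b}_j) = \mathbf{b}_i\cdot\delta\mathbf{b}_j + \mathbf{b}_j\cdot\delta\mathbf{b}_i$,

so that

(117.4)  $\delta g_{ij} = \mathbf{b}_i\cdot\frac{\partial(\delta\boldsymbol{\xi})}{\partial x^j} + \mathbf{b}_j\cdot\frac{\partial(\delta\boldsymbol{\xi})}{\partial x^i}$, by (117.3).

The strain tensor $\epsilon_{ij}$ was defined by

[109.11]  $2\epsilon_{ij} = g_{ij} - h_{ij}$

and hence

(117.5)  $2\delta\epsilon_{ij} = \delta g_{ij}$,

since $\delta(h_{ij}) = \delta(\mathbf{a}_i\cdot\mathbf{a}_j) = 0$.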
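On substituting from (117.4) in (117.5), we get

(117.6)  $2\delta\epsilon_{ij} = \mathbf{b}_i\cdot\frac{\partial(\delta\boldsymbol{\xi})}{\partial x^j} + \mathbf{b}_j\cdot\frac{\partial(\delta\boldsymbol{\xi})}{\partial x^i}$.

But $\delta\boldsymbol{\xi} = (\delta\xi)_i\mathbf{b}^i$, where the $(\delta\xi)_i$ are the covariant components of the
vector $\delta\boldsymbol{\xi}$, and, since

$\frac{\partial(\delta\boldsymbol{\xi})}{\partial x^j} = (\delta\xi)_{i,j}\mathbf{b}^i$

by (46.8), we conclude that (117.6) can be written as

(117.7)  $2\delta\epsilon_{ij} = (\delta\xi)_{i,j} + (\delta\xi)_{j,i}$.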


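If we form the inner product of the vector $(\delta\xi)_i$ with both members of the
equilibrium equation

(117.8)  $\tau^{ij}_{\ \ ,j} = -\rho F^i$,

where $F^i$ is the body force per unit mass (cf. 116.3), and integrate over the
body $\tau$, we get

(117.9)  $\int_\tau \tau^{ij}_{\ \ ,j}(\delta\xi)_i\,d\tau = -\int_\tau \rho F^i(\delta\xi)_i\,d\tau$.

But $\tau^{ij}_{\ \ ,j}(\delta\xi)_i = [\tau^{ij}(\delta\xi)_i]_{,j} - \tau^{ij}(\delta\xi)_{i,j}$, so that (117.9) can be written as

$\int_\tau [\tau^{ij}(\delta\xi)_i]_{,j}\,d\tau - \int_\tau \tau^{ij}(\delta\xi)_{i,j}\,d\tau = -\int_\tau \rho F^i(\delta\xi)_i\,d\tau$,

or

$\int_\Sigma \tau^{ij}(\delta\xi)_i\nu_j\,d\sigma - \int_\tau \tau^{ij}(\delta\xi)_{i,j}\,d\tau = -\int_\tau \rho F^i(\delta\xi)_i\,d\tau$,

where we transformed the volume integral over $\tau$ into the surface integral
over the surface $\Sigma$ bounding $\tau$.
Since $\tau^{ij}\nu_j = T^i$ by (115.4) and

$\tau^{ij}(\delta\xi)_{i,j} = \tfrac{1}{2}\tau^{ij}[(\delta\xi)_{i,j} + (\delta\xi)_{j,i}] = \tau^{ij}\,\delta\epsilon_{ij}$

by (117.7), we have finally

(117.10)  $\int_\tau \tau^{ij}\,\delta\epsilon_{ij}\,d\tau = \int_\Sigma T^i(\delta\xi)_i\,d\sigma + \int_\tau \rho F^i(\delta\xi)_i\,d\tau$.

By definition, the surface integral in (117.10) represents the virtual work
performed by the external surface forces $T^i$ in a virtual displacement $(\delta\xi)_i$.
The volume integral in the right-hand member of (117.10), on the other
hand, represents the virtual work done by the body forces $F^i$. If we denote
the virtual work done by body and surface forces by

(117.11)  $\delta W = \int_\Sigma T^i(\delta\xi)_i\,d\sigma + \int_\tau \rho F^i(\delta\xi)_i\,d\tau$,

we can write (117.10) as

(117.12)  $\delta W = \int_\tau \tau^{ij}\,\delta\epsilon_{ij}\,d\tau$.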
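
The virtual-work identity (117.10) can be verified on a one-dimensional example. The sketch below assumes a straight bar of unit cross section occupying $0 \le y \le L$, loaded by a constant body force $g$ per unit mass along the axis, supported at $y = 0$ and free at $y = L$, so that the equilibrium equation (117.8) gives $\tau = \rho g (L - y)$. The constants, the virtual displacement, and the quadrature rule are all hypothetical choices made for illustration.

```python
import math

RHO, G, L = 7800.0, 9.81, 2.0  # hypothetical density, body force per unit mass, length

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def stress(y):
    """Solution of d(tau)/dy = -rho g, cf. (117.8), with tau(L) = 0 at the free end."""
    return RHO * G * (L - y)

def virtual_displacement(y):
    """Arbitrary smooth virtual displacement (hypothetical)."""
    return 0.3 + math.sin(y)

def virtual_strain(y):
    """Derivative of the virtual displacement, cf. (117.7) in one dimension."""
    return math.cos(y)

# Left-hand member of (117.10).
internal = simpson(lambda y: stress(y) * virtual_strain(y), 0.0, L)

# Surface term: T = tau * nu with exterior normals nu = +1 at y = L and nu = -1 at y = 0.
surface = (stress(L) * (+1.0)) * virtual_displacement(L) \
    + (stress(0.0) * (-1.0)) * virtual_displacement(0.0)

# Body-force term.
body = simpson(lambda y: RHO * G * virtual_displacement(y), 0.0, L)

print("integral of tau * delta(eps):", internal)
print("virtual work of surface and body forces:", surface + body)
```

The agreement does not depend on the particular virtual displacement chosen, which is the essential content of (117.12).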


If in the foregoing calculation instead of the equilibrium equations (117.8)
we considered the dynamical equations,

[116.8]  $\tau^{ij}_{\ \ ,j} = \rho(a^i - F^i)$,

we would have obtained in the left-hand member of (117.10) the additional
term

(117.13)  $\delta K = \int_\tau \rho a^i(\delta\xi)_i\,d\tau$.

This term has a simple mechanical interpretation when the virtual displacements
$(\delta\xi)_i$ are the actual displacements $(d\xi)_i$ that take place in a body
whose motion is governed by equations 116.8. In this event we write
(117.13) as

(117.14)  $dK = \int_\tau \rho a^i(d\xi)_i\,d\tau$.

But the velocity of a point $P$ in $\tau$ is

$v_i = \frac{(d\xi)_i}{dt}$,

and we can thus write (117.14)

$dK = \int_\tau \rho a^iv_i\,d\tau\,dt$.

Now, in the orthogonal cartesian coordinates,

$a^iv_i = \frac{dv_i}{dt}v_i = \frac{1}{2}\frac{d(v)^2}{dt}$,

and hence

$dK = \int_\tau \tfrac{1}{2}\,d(v)^2\,(\rho\,d\tau)$.

The integrand in this integral represents an increment in kinetic energy of
the element of mass $dm = \rho\,d\tau$ acquired by it in the interval of time
$(t, t + dt)$. Thus $dK$ represents an increment of kinetic energy $K =
\int_\tau \tfrac{1}{2}\rho v^2\,d\tau$. Accordingly, for the motion of a body $\tau$ governed by equations
116.8, we have an important result:

(117.15)  $dK + dA = dW$,

where

(117.16)  $dA = \int_\tau \tau^{ij}\,d\epsilon_{ij}\,d\tau$ and $dW = \int_\Sigma T^i(d\xi)_i\,d\sigma + \int_\tau \rho F^i(d\xi)_i\,d\tau$.

In the static case $dK = 0$ and $dA = dW$.
The results of the section, coupled with some thermodynamic considerations,
form the basis for constructing the theoretical models of
elastic bodies, viscous fluids, and so on.




118. Laws of Thermodynamics 


The construction of mathematical models of different types of continuous 
media hinges on the use of certain energy concepts that enter in the 
structures of mechanics and thermodynamics. We borrow from mechanics 
the notions of potential and kinetic energy and from thermodynamics the 
somewhat less sharply defined concepts of chemical energy, heat energy, 
electrical energy, and so on. We shall suppose that functions defining 
various kinds of energies depend on a number of parameters, some of 
which are variables (positional coordinates, temperature, densities, strain 
tensors, and so on), whereas others are physical or universal constants. 
The totality of constant parameters $c_i$ and variable parameters $q^i$ chosen to
describe a given function need not be unique. But whatever particular
choice of a set of parameters is made, we shall assume that the $q^i$ ($i = 1,
\ldots, n$) are independent.

In some special situations the $q^i$ may be determined as functions of a
scalar $t$ (usually time) so that one can regard them as defining a curve

$C$:  $q^i = q^i(t)$,

characterizing a certain process.


In the preceding section we introduced the notion of work or mechanical
energy by considering linear forms of the type

(118.1)  $\delta W = Q_i(q^1, \ldots, q^n; c_1, \ldots, c_m)\,dq^i$.

The line integral $\int_C Q_i\,dq^i$ then represents the work done along the path
$C$ by the generalized forces $Q_i$. Ordinarily, such integrals depend on the
path $C$ associated with a given process.

We shall suppose that a particle of mass $dm = \rho\,d\tau$, where $\rho$ is the density
and $d\tau$ is the volume element, may acquire energies other than mechanical,
and we shall represent such accretions of energy in the form

(118.2)  $\delta E = F_i(q^1, \ldots, q^n; c_1, \ldots, c_m)\,dq^i$.

If $\delta E$ includes all energies other than mechanical, the total amount of
energy acquired by the particle is determined by the integral

(118.3)  $\int_C (\delta W + \delta E) = \int_C (F_i + Q_i)\,dq^i$.

From the principle of conservation of total energy we conclude that the
integral 118.3 must vanish for an arbitrary closed path $C$, and hence the
integrand $(F_i + Q_i)\,dq^i$ is an exact differential of some function $\bar{U}(q^1, \ldots,
q^n; c_1, \ldots, c_m)$ determined to within a constant of integration. We shall
call $\bar{U}$ the total energy per unit mass and define the internal energy $U$ per unit
mass by the formula

(118.4)  $U = \bar{U} - \tfrac{1}{2}v^2$,

where $v$ is the velocity of the element of mass $dm$. The amount $K = \tfrac{1}{2}v^2$
represents the kinetic energy per unit mass, so that the total energy

$\bar{U} = U + K$.


We can thus formulate the basic law of conservation of total energy in
the form

(118.5)  $\delta K + \delta U = \delta W + \delta E$,

where the left-hand member in (118.5) is the sum of the increments of
kinetic energy $K$ and internal energy $U$ acquired by the unit mass.

When $\delta E$ consists only of the heat energy $\delta Q$, we have the statement of
the First Law of Thermodynamics:

(118.6)  $\delta K + \delta U = \delta W + \delta Q$.

The heat energy $\delta Q$, as shown in works on thermodynamics, can be
determined by specifying the temperature $T$. Experiments show that heat
invariably passes from bodies with higher temperature to those with lower
temperature and that the transfer of heat from one body to another is
wholly determined by $T$ and, of course, by certain physical parameters
depending on constitutive properties of the bodies. Experiments further
show that it is impossible to construct a machine which transforms the
heat energy into mechanical energy from a body with the least temperature.
It is a consequence of this Second Law of Thermodynamics that for every
reversible thermodynamic process there exists a function $S$, called entropy,
such that

(118.7)  $\delta S\,dm = \frac{\delta Q}{T}$,

$T$ being the absolute temperature and $\delta Q$ an increment of heat acquired
by the element of mass $dm$.

When the medium is in the state of mechanical equilibrium, the kinetic
energy $K$ vanishes, and the law (118.6) assumes the form

(118.8)  $\delta U = \delta W + \delta Q$.


We shall make use of the laws 118.7 and 118.8 in Sec. 119 to construct a 
mathematical model of an elastic body. 




119. Elastic Media 


Some bodies possess the property of recovering their original size and 
shape when the impressed forces producing deformations are removed. 
The media of which such bodies are composed are called elastic. In con- 
structing a model of an elastic body we shall suppose that all processes 
taking place in such a body are reversible, but we do not assume that the 
body is necessarily in the state of thermal equilibrium. Thus our thermo- 
elastic model will take account of the effects of temperature on defor- 
mations. 

As our points of departure we take the First Law of Thermodynamics in
the form [cf. (118.8)]

(119.1)  $\delta U = \delta Q + \delta W$,

in which

(119.2)  $\delta W = \int_\tau \tau^{ij}\,\delta\epsilon_{ij}\,d\tau$

is given by (117.12).
We also write the relation 118.7 in the form

$\delta Q = \int T\,\delta S\,dm$

or

(119.3)  $\delta Q = \int_\tau T\,\delta S\,\rho\,d\tau$.

If $u$ denotes the internal energy $U$ per unit mass of the body, then

(119.4)  $\delta U = \int \delta u\,dm = \int_\tau \rho\,\delta u\,d\tau$,

where $\delta U$ stands for the increment of internal energy acquired by $\tau$.
The substitution in (119.1) from (119.2), (119.3), and (119.4) gives

(119.5)  $\int_\tau \rho\,\delta u\,d\tau = \int_\tau \rho T\,\delta S\,d\tau + \int_\tau \tau^{ij}\,\delta\epsilon_{ij}\,d\tau$.

We suppose that the integrands in (119.5) are continuous functions and,
since the equality 119.5 holds in an arbitrary subregion of $\tau$, we conclude
that

(119.6)  $\delta u = T\,\delta S + \frac{1}{\rho}\tau^{ij}\,\delta\epsilon_{ij}$

at all points of $\tau$.


Sec. 119] ELASTIC MEDIA 589 


The formula 119.6 suggests that we regard w as a function of the in- 
dependent variable S and of the nine independent parameters e,;. Since 
the components «;; of the stress tensor Ey = «€,;a'a’ usually depend on the 
choices of a coordinate system X, the function u may also contain 
explicitly the metric tensor /,; and the coordinates x’. And, of course, u 
must depend on an assortment of parameters {c} associated with the 
physical properties of the medium. Thus we are led to consider u in the 
form 


(119.7) ue y Ga Oe} 2); 


where the arguments of uw are deemed independent. The relation (119.6) 
then permits us to assert that 


ce = 1 T” and a = T. 
de; P 0s 
The first of these relations 
(119.8) ee 
de 


connects components T of the stress tensor with components ¢,; of the 
strain tensor. It thus yields a set of stress-strain relations, in which the 
internal energy density u serves as a potential function. 

A different potential function can be constructed by defining a function
$\phi$, known as the free energy, by

(119.9)   $\phi = u - TS$.

From (119.9), the increment $\delta\phi$ of $\phi$ is

$\delta\phi = \delta u - T\,\delta S - S\,\delta T$,

and the substitution in this expression for $\delta u$ from (119.6) gives

(119.10)   $\delta\phi = \frac{1}{\rho}\,\tau^{ij}\,\delta\epsilon_{ij} - S\,\delta T$.

Because of the appearance of $\delta\epsilon_{ij}$ and $\delta T$ in the right-hand member of
(119.10), we are now led to regard $T$ and $\epsilon_{ij}$ as independent variables and
consider $\phi$ in the form [cf. (119.7)]

(119.11)   $\phi = \phi(h_{ij}, \epsilon_{ij}, T, \{c\}, x^i)$.

We conclude, then, from (119.10) that

$\frac{\partial\phi}{\partial\epsilon_{ij}} = \frac{1}{\rho}\,\tau^{ij}$,   $\frac{\partial\phi}{\partial T} = -S$,

so that the stress-strain relation now has the form

(119.12)   $\tau^{ij} = \rho\,\frac{\partial\phi}{\partial\epsilon_{ij}}$.

Thus either $u$ or $\phi$ (when they exist) can be used to deduce the
stress-strain relations. When the process is adiabatic, $S$ = constant and hence
$\delta Q = 0$ by (119.3). It is then more convenient to use $u$ as a
stress-potential. In the isothermal case, $T$ = constant and $\phi$ appears to be more
suitable.
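
The way a potential generates stresses in (119.8) and (119.12) can be checked numerically. The sketch below is an illustration only, under assumptions that are not part of the text: Cartesian coordinates, small strains, and an assumed isothermal quadratic potential $\rho\phi = \tfrac{1}{2}\lambda(e_{kk})^2 + \mu\,e_{ij}e_{ij}$. The function names are hypothetical. The potential is differentiated with respect to each of the nine components, treated as independent, and the result is compared with the closed-form derivative.

```python
# Illustrative sketch: stresses as derivatives of an ASSUMED potential,
# tau_ij = d(rho*phi)/d(e_ij), in Cartesian coordinates.

def rho_phi(e, lam, mu):
    """Assumed quadratic potential: (lam/2)*(trace e)^2 + mu * sum of e_ij^2."""
    trace = e[0][0] + e[1][1] + e[2][2]
    quad = sum(e[i][j] ** 2 for i in range(3) for j in range(3))
    return 0.5 * lam * trace ** 2 + mu * quad


def stress_from_potential(e, lam, mu, h=1e-6):
    """Central-difference derivative of rho_phi, treating the nine
    components e_ij as independent variables."""
    tau = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in e]
            minus = [row[:] for row in e]
            plus[i][j] += h
            minus[i][j] -= h
            tau[i][j] = (rho_phi(plus, lam, mu) - rho_phi(minus, lam, mu)) / (2.0 * h)
    return tau


def stress_closed_form(e, lam, mu):
    """Exact derivative of the same potential: lam*(trace e)*delta_ij + 2*mu*e_ij."""
    trace = e[0][0] + e[1][1] + e[2][2]
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * e[i][j]
             for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    strain = [[1e-3, 2e-4, 0.0], [2e-4, -5e-4, 1e-4], [0.0, 1e-4, 3e-4]]
    numeric = stress_from_potential(strain, lam=1.5, mu=1.0)
    exact = stress_closed_form(strain, lam=1.5, mu=1.0)
    worst = max(abs(numeric[i][j] - exact[i][j]) for i in range(3) for j in range(3))
    print("largest difference between the two stress computations:", worst)
```

Because the assumed potential is quadratic, the central difference is exact up to rounding, so the two computations agree closely.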

We say that an elastic medium is homogeneous whenever the coordinates 
$x^i$ do not appear explicitly in (119.7) or (119.11). The medium is isotropic
when all parameters in the set {c} are scalars, so that the values in {c} are 
independent of the choices of the reference frames X. When the medium is 
both homogeneous and isotropic, the parameters {c} have constant values 
throughout the medium. 

If we consider a homogeneous elastic medium and suppose that $\phi(\epsilon_{ij}, T)$
is an analytic function of the $\epsilon_{ij}$ and of $\Delta T = T - T_0$, where $T_0$ is the
temperature of the initial state, we can expand $\phi$ in powers of $\epsilon_{ij}$ and $\Delta T$.
When the initial state of the body is that corresponding to $\epsilon_{ij} = 0$ and
$\tau^{ij} = 0$, the expansions will begin with the second-order terms, so that

$\rho_0\,\phi = \tfrac{1}{2}\,c^{ijkl}\,\epsilon_{ij}\epsilon_{kl} + k^{ij}\,\epsilon_{ij}\,\Delta T + \cdots$.


For small deformations, the terms of order higher than two can be
neglected, and we obtain with the aid of (119.12) a linear stress-strain
relation that includes the effects of temperature on the stress tensor $\tau^{ij}$.
It is

(119.13)   $\tau^{ij} = c^{ijkl}\,e_{kl} + k^{ij}(T - T_0)$,

where we replaced $\rho$ by $\rho_0$, the density of the initial state, and wrote
$e_{ij}$ for the linearized components $\epsilon_{ij}$. The tensor $c^{ijkl}$ characterizes the
elastic properties of the medium and the $k^{ij}$ are related to the coefficients of
thermal expansion. For a given medium the tensors $c^{ijkl}$ and $k^{ij}$ must be
determined from experiments. When $T = T_0$, the relation 119.13 reduces
to the familiar generalized Hooke’s law of linear elasticity,⁶

(119.14)   $\tau^{ij} = c^{ijkl}\,e_{kl}$.


In the next section we deduce a special form of stress-strain relations for 
large deformations for a homogeneous isotropic elastic medium and get 
from it the familiar Hooke’s law of linear theory of elasticity. 


⁶ See I. S. Sokolnikoff, Mathematical Theory of Elasticity (1956), pp. 58-67, where it
is shown that the number of independent elastic coefficients $c^{ijkl}$ in the most general
anisotropic case is 21.




120. Stress-strain Relations in Isotropic Elastic Media 


When the orientation of coordinate axes is immaterial, the arguments of
the potential $\phi$ in (119.11) are scalars or tensors that depend only on the
metric tensor $h_{ij}$. In this event the scalar invariants of the tensors $h_{ij}$ and $\epsilon_{ij}$
can be considered as functions of the invariants $\vartheta_i$, defined in Sec. 111,
and $\phi$ can be taken in the form

$\phi = f(\vartheta_1, \vartheta_2, \vartheta_3, T, \{c\}, x^i)$.

If the medium is both homogeneous and isotropic, $\phi$ assumes the form

(120.1)   $\phi = f(\vartheta_1, \vartheta_2, \vartheta_3, T, \{c\})$,


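in which all parameters in $\{c\}$ are constants. The formula

[119.12]   $\tau^{ij} = \rho\,\frac{\partial\phi}{\partial\epsilon_{ij}}$

with $\phi$ specified by (120.1) can be written as⁷

(120.2)   $\tau^i_j = \rho\,(\delta^i_k - 2\epsilon^i_k)\,\frac{\partial\phi}{\partial\epsilon^j_k}$,

where $\tau^i_j = g_{j\alpha}\tau^{i\alpha}$ and $\epsilon^i_j = g^{i\alpha}\epsilon_{\alpha j}$.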

If we now suppose that $\phi$ in (120.1) with $T$ = constant can be
expanded in a power series in the $\vartheta_i$ and consider the case when there is no
initial stress, so that $\tau^i_j = 0$ when $\epsilon^i_j = 0$, the expansion takes the form

(120.3)   $\rho_0\,\phi = c_1\vartheta_1^2 + c_2\vartheta_2 + c_3\vartheta_1^3 + c_4\vartheta_1\vartheta_2 + c_5\vartheta_3 + \cdots$.


If in this expression we retain only the terms up to the third order in the $\epsilon^i_j$, we
see from (120.2) that the expression for the stresses $\tau^i_j$ in terms of the strains
$\epsilon^i_j$ will contain five elastic coefficients $c_i$. From the mass-conservation
principle it follows that

$\rho_0\,d\tau_0 = \rho\,d\tau$,
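⁷ Note that

(a)   $\tau^{ij} = \rho\,\frac{\partial\phi}{\partial\epsilon_{ij}} = \rho\,\frac{\partial\phi}{\partial\epsilon^\alpha_\beta}\,\frac{\partial\epsilon^\alpha_\beta}{\partial\epsilon_{ij}}$.

Since $g_{\alpha\beta} = 2\epsilon_{\alpha\beta} + h_{\alpha\beta}$,

$\frac{\partial g_{\alpha\beta}}{\partial\epsilon_{ij}} = 2\,\delta^i_\alpha\,\delta^j_\beta$.

Compute $\partial\epsilon^\alpha_\beta/\partial\epsilon_{ij}$ from $\epsilon^\alpha_\beta = g^{\alpha\gamma}\epsilon_{\gamma\beta}$, use the above result, and conclude that

$\frac{\partial\epsilon^\alpha_\beta}{\partial\epsilon_{ij}} = g^{\alpha i}\,(\delta^j_\beta - 2\epsilon^j_\beta)$.

Formula 120.2 then follows on substituting this result in (a).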


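and formulas 112.3 and 112.4 yield the result

$\rho = \rho_0\,(1 - \vartheta_1 - \tfrac{1}{2}\vartheta_1^2 + 2\vartheta_2)$,

if we discard the third-order terms in the $\epsilon^i_j$. The substitution from this
formula and (120.3) in (120.2) gives the following expression for the
stress-strain relation, where we retain only the terms up to the second order in the strains:

(120.4)   $\tau^i_j = [(2c_1 + c_2)\vartheta_1 + (3c_3 + c_4 - 2c_1 - c_2)\vartheta_1^2 + c_4\vartheta_2]\,\delta^i_j$
          $\; - [c_2 + (4c_1 + c_2 + c_4)\vartheta_1]\,\epsilon^i_j + 2c_2\,\epsilon^i_k\epsilon^k_j + c_5\,\frac{\partial\vartheta_3}{\partial\epsilon^j_i}$.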

These involve five elastic constants. If, however, we retain in (120.4) only
the first-degree terms in the $\epsilon^i_j$, we get the linear law

(120.5)   $\tau^i_j = (2c_1 + c_2)\,\vartheta_1\,\delta^i_j - c_2\,e^i_j$.

We identify this result with the generalized Hooke’s law for isotropic media

(120.6)   $\tau^i_j = \lambda\,\vartheta_1\,\delta^i_j + 2\mu\,e^i_j$,

where $\lambda$ and $\mu$ are Lamé’s constants, related to Young’s modulus $E$ and
Poisson’s ratio $\sigma$ by

$\lambda = \frac{E\sigma}{(1 + \sigma)(1 - 2\sigma)}$,   $\mu = \frac{E}{2(1 + \sigma)}$.

We see that

$c_1 = \tfrac{1}{2}(\lambda + 2\mu)$,   $c_2 = -2\mu$.
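
The relations between the elastic constants lend themselves to a quick numerical check. The following is a minimal sketch, assuming Cartesian coordinates and small strains; the function names and the sample values of $E$ and $\sigma$ are illustrative choices, not data from the text. It converts Young’s modulus and Poisson’s ratio to Lamé’s constants, forms $c_1$ and $c_2$, and applies the linear law (120.6) to the strain state of a bar under simple tension.

```python
# Illustrative sketch: Lamé's constants and the linear law (120.6)
# in Cartesian coordinates. All names are hypothetical.

def lame_constants(young, poisson):
    """Return (lam, mu) from Young's modulus E and Poisson's ratio sigma."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu


def quadratic_coefficients(lam, mu):
    """Coefficients c1, c2 of (120.3) found by matching (120.5) with (120.6)."""
    return 0.5 * (lam + 2.0 * mu), -2.0 * mu


def hooke_stress(e, lam, mu):
    """tau_ij = lam * theta_1 * delta_ij + 2 * mu * e_ij, with theta_1 = e_kk."""
    theta = e[0][0] + e[1][1] + e[2][2]
    return [[lam * theta * (1.0 if i == j else 0.0) + 2.0 * mu * e[i][j]
             for j in range(3)] for i in range(3)]


def simple_tension_strain(stress, young, poisson):
    """Strains of a bar pulled along the first axis with the given stress."""
    axial = stress / young
    lateral = -poisson * axial
    return [[axial, 0.0, 0.0], [0.0, lateral, 0.0], [0.0, 0.0, lateral]]


if __name__ == "__main__":
    E, sigma = 200.0e9, 0.3          # assumed sample values
    lam, mu = lame_constants(E, sigma)
    c1, c2 = quadratic_coefficients(lam, mu)
    print("lambda, mu:", lam, mu)
    print("2*c1 + c2 (should equal lambda):", 2.0 * c1 + c2)
    tau = hooke_stress(simple_tension_strain(1.0e8, E, sigma), lam, mu)
    print("stress components along the axes:", tau[0][0], tau[1][1], tau[2][2])
```

In simple tension the law returns the applied stress along the bar and vanishing lateral stresses, which is a convenient consistency test of the two formulas for $\lambda$ and $\mu$.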
If we replace $c_1$ and $c_2$ in (120.4) by these values and set $c_3 = l$, $c_4 = m$,
$c_5 = n$, we can write it in the form

(120.7)   $\tau^i_j = [\lambda\vartheta_1 + (3l + m - \lambda)\vartheta_1^2 + m\vartheta_2]\,\delta^i_j$
          $\; + [2\mu - (m + 2\lambda + 2\mu)\vartheta_1]\,\epsilon^i_j - 4\mu\,\epsilon^i_k\epsilon^k_j + n\,\Phi^i_j$,

where $\Phi^i_j$ is defined by the formula

$\Phi^i_j = \frac{\partial\vartheta_3}{\partial\epsilon^j_i}$.

The new elastic constants $l$, $m$, and $n$ appearing in (120.7) are subject to
experimental determination, just as Lamé’s constants $\lambda$ and $\mu$ are.⁸


⁸ Assumptions, of varying degrees of plausibility, about the possible relations that
might exist between the new constants ($l$, $m$, $n$) and the old ones ($\lambda$, $\mu$) have been made
by several authors. Murnaghan obtained a good agreement with experimental results
(for solids subjected to high hydrostatic pressures) by setting $l = m = n = 0$ in formulas
120.7. A discussion of this appears in a paper by F. D. Murnaghan, “The
Compressibility of Solids under Extreme Pressures,” Th. v. Kármán Anniversary Volume (1941),
pp. 112-136. See also P. Riz, Comptes rendus (Doklady) Acad. Sci. U.R.S.S., 20 (1938),
and P. M. Riz and N. V. Zvolinsky, Journal of Applied Mathematics and Mechanics,
Acad. Sci. U.S.S.R., 2 (1939).




An excellent discussion of a model of a thermoelastic isotropic medium 
is contained on pages 234-241 of L. I. Sedov’s Introduction to Mechanics 
of a Continuous Medium, Moscow (1962). 




121. Equations of Elasticity 
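If we write the stress-strain relations 120.6 in the form

(121.1)   $\tau_{ij} = \lambda\,g_{ij}\,\vartheta + 2\mu\,e_{ij}$,

where $\vartheta = g^{ij}e_{ij} = e^i_i$, and use the equilibrium equations 116.3 in the
form

(121.2)   $g^{jk}\tau_{ij,k} + F_i = 0$,

we can write down the linearized differential equations of equilibrium, in
terms of the displacement vector $u^i$, by recalling that (equation 113.13)

(121.3)   $e_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i})$.

The computation proceeds as follows. The substitution from (121.1)
into (121.2) yields

$g^{jk}(\lambda\,g_{ij}\,\vartheta_{,k} + 2\mu\,e_{ij,k}) + F_i = 0$

or

(121.4)   $\lambda\,\frac{\partial\vartheta}{\partial x^i} + 2\mu\,g^{jk}e_{ij,k} + F_i = 0$.

But from (121.3)

$g^{jk}e_{ij,k} = \tfrac{1}{2}\,g^{jk}(u_{i,jk} + u_{j,ik}) = \tfrac{1}{2}\,g^{jk}u_{i,jk} + \tfrac{1}{2}\,\frac{\partial\vartheta}{\partial x^i}$,

since $g^{jk}u_{j,k} = u^k_{,k}$ and $u^k_{,k} = \vartheta$. Thus (121.4) becomes

(121.5)   $(\lambda + \mu)\,\frac{\partial\vartheta}{\partial x^i} + \mu\,g^{jk}u_{i,jk} + F_i = 0$.

If we recall the notation 92.7,

$g^{jk}u_{i,jk} = \nabla^2 u_i$,

we get

(121.6)   $(\lambda + \mu)\,\frac{\partial\vartheta}{\partial x^i} + \mu\nabla^2 u_i + F_i = 0$.

These are the celebrated Navier equations in the classical theory of elasticity.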
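
A displacement field can be tested against the Navier equations by evaluating the left-hand member of (121.6) directly. The sketch below is a hedged illustration restricted to Cartesian coordinates, where covariant derivatives reduce to partial derivatives; the helper names, the finite-difference step, and the sample field are assumptions made for the example.

```python
# Illustrative sketch: residual of (lam + mu) d(theta)/dx_i + mu Laplacian(u_i) + F_i
# in Cartesian coordinates, with derivatives taken by central differences.

def second_partial(f, x, i, j, h):
    """Central-difference estimate of d^2 f / dx_i dx_j at the point x."""
    def at(di, dj):
        y = list(x)
        y[i] += di * h
        y[j] += dj * h
        return f(y)
    if i == j:
        return (at(1, 0) - 2.0 * at(0, 0) + at(-1, 0)) / (h * h)
    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * h * h)


def navier_residual(u, body_force, lam, mu, x, h=1e-2):
    """Left-hand member of the Navier equations at x; u and body_force map a
    point to a 3-tuple of components."""
    force = body_force(x)
    residual = []
    for i in range(3):
        # d(theta)/dx_i, where theta = du_k/dx_k
        grad_div = sum(second_partial(lambda y, k=k: u(y)[k], x, i, k, h)
                       for k in range(3))
        # Laplacian of the component u_i
        laplacian = sum(second_partial(lambda y, i=i: u(y)[i], x, k, k, h)
                        for k in range(3))
        residual.append((lam + mu) * grad_div + mu * laplacian + force[i])
    return residual


if __name__ == "__main__":
    a, lam, mu = 0.5, 2.0, 1.0       # assumed sample values
    u = lambda y: (a * y[0] ** 2, 0.0, 0.0)
    # For this field theta = 2*a*x_1, so equilibrium requires
    # F_1 = -2*a*(lam + 2*mu) and F_2 = F_3 = 0.
    force = lambda y: (-2.0 * a * (lam + 2.0 * mu), 0.0, 0.0)
    print(navier_residual(u, force, lam, mu, [0.3, -0.2, 0.7]))
```

For a displacement that is quadratic in the coordinates the central differences are exact up to rounding, so the residual printed for the sample field is negligibly small.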


The equations of motion,

(121.7)   $(\lambda + \mu)\,\frac{\partial\vartheta}{\partial x^i} + \mu\nabla^2 u_i + F_i = \rho\,a_i$,

follow at once from (121.6) upon application of the D’Alembert principle.

The differential equations 121.6 and 121.7 for the displacement vector
$u_i$ can be shown to yield unique solutions when suitable boundary and
initial conditions are specified. We refer interested readers to treatises on
the mathematical theory of elasticity where such boundary value problems
are discussed in detail.⁹


122. Fluid Mechanics. Equations of Continuity 


We now turn to the formulation of equations governing the flow of
liquids and gases. From the point of view of mechanics, fluids are
continuous distributions of matter which cannot support shearing stresses
when at rest. It follows from this definition that the stress vector $T^i$ on a
surface element $d\sigma$ of a fluid at rest is normal to the element. In symbols,

$T^i = -p\,\nu^i$,

where $\nu^i$ is the unit normal to the surface element and $p(x^1, x^2, x^3, t)$ is the
invariant called the hydrostatic or fluid pressure. In general the pressure $p$
is a function of the time $t$ as well as of the coordinates $x^i$.
Since the vector $T^i$ is expressible in terms of the stress tensor $\tau^{ij}$, and
$\nu^i = g^{ij}\nu_j$, we see that

$T^i = \tau^{ij}\nu_j = -p\,g^{ij}\nu_j$.

Hence

(122.1)   $\tau^{ij} = -p\,g^{ij}$.

It follows from (122.1) that the hydrostatic pressure $p$ is related to the
stress invariant $\Theta = g_{ij}\tau^{ij}$ (see equation 115.11) by the formula

(122.2)   $p = -\tfrac{1}{3}\,g_{ij}\tau^{ij}$.

⁹ See, for example, I. S. Sokolnikoff, Mathematical Theory of Elasticity, New York
(1956); A. E. H. Love, A Treatise on the Mathematical Theory of Elasticity, Cambridge
(1927).

When the fluid is set in motion, however, in addition to the normal
stresses, new oblique stresses, produced by the interaction of moving
particles, arise. For instance, if a fluid at rest is placed between two large
parallel plates and one of the plates is caused to move parallel to the other
plate (Fig. 53), the fluid particles adhering to the moving plate transmit
their momentum to the particles in the interior. In this way the fluid
between the plates is set in motion, and experiments show that the
retarding force per unit area of the plate, exerted on the plate by the fluid,
is proportional to its velocity and inversely proportional to the distance
between the plates. The proportionality constant in this relation is the
measure of viscosity of the fluid.

[Fig. 53]
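
The experimental law just stated amounts to a one-line formula. The sketch below is a minimal illustration; the function name, the argument names, and the sample numbers are assumptions, and the units are taken to be consistent (for instance SI).

```python
# Illustrative sketch: retarding force per unit area on the moving plate,
# proportional to the plate speed and inversely proportional to the gap.

def plate_shear_stress(viscosity, plate_speed, gap):
    """Force per unit area = viscosity * plate_speed / gap."""
    if gap <= 0.0:
        raise ValueError("the distance between the plates must be positive")
    return viscosity * plate_speed / gap


if __name__ == "__main__":
    base = plate_shear_stress(viscosity=1.0e-3, plate_speed=2.0, gap=0.01)
    print("reference case:", base)
    print("speed doubled: ", plate_shear_stress(1.0e-3, 4.0, 0.01))
    print("gap doubled:   ", plate_shear_stress(1.0e-3, 2.0, 0.02))
```

Doubling the speed doubles the force per unit area, while doubling the gap halves it.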

We shall say that the fluid is viscous if the stress tensor for a fluid in
motion has the form

(122.3)   $\tau^{ij} = -p\,g^{ij} + t^{ij}$,

where the nonvanishing tensor $t^{ij}$ is the tensor of viscous stresses. The
fluid is called ideal if $t^{ij} = 0$.
The mass-conservation principle of mechanics requires that

$dm = \rho_0\,d\tau_0 = \rho\,d\tau$,

where $\rho_0 = \rho(x, t_0)$ is the density of matter in the volume element
$d\tau_0 = \sqrt{h}\,dx^1\,dx^2\,dx^3$ of the initial state and $\rho(x, t)$ is the density in the volume
element $d\tau = \sqrt{g}\,dx^1\,dx^2\,dx^3$ at time $t$. Thus

(122.4)   $\frac{\rho(x, t_0)}{\rho(x, t)} = \frac{d\tau}{d\tau_0}$,

which we can also write as

(122.5)   $-\frac{\rho(x, t) - \rho(x, t_0)}{\rho(x, t)} = \frac{d\tau - d\tau_0}{d\tau_0}$.

The numerator in the left-hand member of (122.5), $\Delta\rho = \rho(x, t) - \rho(x, t_0)$,
represents the change in density in the small interval of time $\Delta t = t - t_0$,
whereas the right-hand member

$\frac{d\tau - d\tau_0}{d\tau_0} \doteq \vartheta_1$, by (112.7),

is the corresponding small change in volume per unit volume. Since
$\vartheta_1 = \operatorname{div}\mathbf{u} = u^i_{,i}$ (see Sec. 113), we can write (122.5) as

(122.6)   $-\frac{\Delta\rho}{\rho} = u^i_{,i}$,

and since $u^i = v^i\,\Delta t$, where the $v^i$ denote the components of velocity $d\mathbf{u}/dt$,
we conclude from (122.6) that $-(1/\rho)(d\rho/dt) = v^i_{,i}$. We thus get the
continuity equation in the form

(122.7)   $\frac{1}{\rho}\frac{d\rho}{dt} + v^i_{,i} = 0$.


We recall that $\rho(x, t)$ is a function of the coordinates $x^i$ in the reference
frame $X$ in which the $x^i$ are independent of $t$. If $Y$ is a fixed reference
frame (cf. Sec. 109) in which the coordinates $y^i$ of the particle are given by

$y^i = y^i(x^1, x^2, x^3, t)$,

then the chain rule of differentiation gives for $\rho(y, t)$

$\frac{d\rho}{dt} = \left(\frac{\partial\rho}{\partial t}\right)_{y^i\ \text{fixed}} + \frac{\partial\rho}{\partial y^i}\,v^i$.

Equation 122.7 then assumes the form

$\frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial y^i}\,v^i + \rho\,v^i_{,i} = 0$,

or

(122.8)   $\frac{\partial\rho}{\partial t} + (\rho v^i)_{,i} = 0$.

In this formula, the covariant differentiation is performed with respect to
the metric tensor in the frame $Y$ and $v^i = dy^i/dt$. Formula 122.8
specializes to (122.7) when the system $Y$ moves with the particle.
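
The conservation form (122.8) has a direct discrete counterpart. The sketch below is an illustration under assumptions of our own: one space dimension, Cartesian coordinate $y$, a periodic grid of equal cells, and a first-order upwind flux. None of these choices comes from the text; the point is only that writing the equation as a time derivative plus a divergence makes the total mass invariant.

```python
# Illustrative sketch: one explicit step of d(rho)/dt + d(rho*v)/dy = 0
# on a periodic one-dimensional grid, using upwind fluxes at cell faces.

def upwind_step(rho, face_velocity, dx, dt):
    """Return the cell densities after one time step.

    rho[i] is the mean density of cell i; face_velocity[i] is the velocity
    at the face between cell i and cell i+1 (indices taken modulo len(rho)).
    """
    n = len(rho)
    flux = []
    for i in range(n):
        v = face_velocity[i]
        # Mass is carried from the cell lying upstream of the face.
        donor = rho[i] if v >= 0.0 else rho[(i + 1) % n]
        flux.append(v * donor)
    # flux[i - 1] with i = 0 refers to the last face, which closes the period.
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


if __name__ == "__main__":
    rho = [1.0, 2.0, 1.5, 0.5]
    v = [0.3, -0.2, 0.1, 0.4]
    new_rho = upwind_step(rho, v, dx=1.0, dt=0.1)
    print("total mass before and after:", sum(rho), sum(new_rho))
```

Each face flux leaves one cell and enters its neighbor, so the sum of the cell densities is unchanged by the step apart from rounding.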


123. Ideal Fluids. Euler’s Equations 


In this section we deduce a set of equations governing the behavior of 
ideal fluids. We recall from Sec. 122 that in an ideal fluid the stress tensor 
has a simple form 


(123.1)   $\tau^{ij} = -p\,g^{ij}$,

in which the scalar $p$ is the pressure.
On substituting from (123.1) in the general dynamical equations

[116.8]   $\rho(a^i - F^i) = \tau^{ij}_{,j}$,

we get three Euler’s equations

(123.2)   $\rho(a^i - F^i) = -g^{ij}p_{,j}$,

or in vector form,

$\rho(\mathbf{a} - \mathbf{F}) = -\operatorname{grad} p$.

Equations (123.2) involve five unknowns: the density $\rho(x, t)$, the pressure
$p(x, t)$, and three components of velocity $v^i(x, t)$, since $a^i = \delta v^i/\delta t$. The
system of three equations (123.2) for the determination of the five
unknowns, thus, is not complete, and we need two additional independent
equations to complete the system. One such equation is the continuity
equation

(123.3)   $\frac{1}{\rho}\frac{d\rho}{dt} + v^i_{,i} = 0$,

deduced in Sec. 122. The remaining equation, known as the equation of
state, is furnished by the thermodynamical equations 118.7 and 118.8,
which, for reversible processes in the fluid, we write in the form

(123.4)   $dQ = T\,dS\,dm$,

(123.5)   $dU = dW + dQ$.

The work

(123.6)   $dW = \frac{\tau^{ij}\,d\epsilon_{ij}}{\rho}\,dm$,

performed by internal stresses $\tau^{ij}$ on an element of mass $dm = \rho\,d\tau$ can
be written as [cf. (119.2)]

(123.7)   $dW = -\frac{p\,g^{ij}}{\rho}\,\frac{d\epsilon_{ij}}{dt}\,dt\,dm$,

when we note equations 123.1.

We shall suppose that the components $\epsilon_{ij}$ of the deformation tensor in a
fluid are so small that they are represented with sufficiently high accuracy
by linearized formulas 113.9 or 113.10, which coalesce in the infinitesimal
theory. Thus we write

$2e_{ij} = w_{i,j} + w_{j,i}$

and

$2\,\frac{de_{ij}}{dt} = \frac{d}{dt}(w_{i,j} + w_{j,i})$,

where the $e_{ij}$ are the linearized components $\epsilon_{ij}$. Since the velocity
components $v_i = dw_i/dt$, we conclude from the equation just written that

(123.8)   $\frac{de_{ij}}{dt} = \tfrac{1}{2}(v_{i,j} + v_{j,i})$.


The substitution for $de_{ij}/dt$ from (123.8) in (123.7) then gives

$dW = -\frac{p}{\rho}\,v^i_{,i}\,dt\,dm$,

and, since $v^i_{,i} = -(1/\rho)(d\rho/dt)$ by (123.3), we have

$dW = \frac{p}{\rho^2}\,\frac{d\rho}{dt}\,dt\,dm = -p\,d\!\left(\frac{1}{\rho}\right)dm$.

On making use of this result in (123.5), we get for the amount of heat $dQ$
acquired by an element of mass $dm$,

(123.9)   $dQ = dU + p\,d\!\left(\frac{1}{\rho}\right)dm$.

On the other hand, formula 123.4 states that

(123.10)   $dQ = T\,dS\,dm$.

If we let $dq$ stand for the change in heat per unit mass, so that $dq = dQ/dm$,
and denote the change in internal energy $U$ per unit mass by $du$, we can
write (123.9) and (123.10) as

(123.11)   $dq = du + p\,d\!\left(\frac{1}{\rho}\right)$,   $dq = T\,dS$.

In a variety of problems the absolute temperature $T$, the internal energy
density $u$, and entropy $S$ appear to depend only on the pressure $p$ and
density $\rho$, so that¹⁰

(123.12)   $T = T(p, \rho)$,   $S = S(p, \rho)$,   $u = u(p, \rho)$.

It follows then from (123.11) that $T$, $S$, and $u$ are not independent, since
equations 123.11 require that

(123.13)   $T\,dS = du + p\,d\!\left(\frac{1}{\rho}\right)$.

When $T$, $S$, and $u$ in (123.12) are determined (either experimentally or
from theoretical considerations), the differential equation 123.13, if
integrable, specifies $p$ as some function of $\rho$:

(123.14)   $p = f(\rho)$.

Equation 123.14 is the desired equation of state needed to complete the
system of four equations 123.2, 123.3 for the determination of the five
unknown functions $v^1$, $v^2$, $v^3$, $p$, and $\rho$.


¹⁰ These functions may (and usually do) depend on physical or chemical constants
which characterize properties of a specific fluid.


124. Viscous Fluids. Navier’s Equations 


When a viscous fluid is in motion, the components $\tau^{ij}$ of the stress tensor
have the form

(124.1)   $\tau^{ij} = -p\,g^{ij} + t^{ij}$,

where the $t^{ij}$, as noted in Sec. 122, are associated with viscous stresses.

As in Sec. 123, we limit ourselves to the consideration of small
displacements and write formula 123.8 as

(124.2)   $\dot e_{ij} = \tfrac{1}{2}(v_{i,j} + v_{j,i})$,

where $\dot e_{ij} = de_{ij}/dt$ are components of the strain velocity tensor.

The construction of models of viscous fluids and the formulation of
complete systems of equations now call for the introduction of additional
assumptions about the nature of viscous stresses. The latter must
obviously depend on the strain velocities $\dot e_{ij}$ and, to a first-order
approximation, it is natural to suppose that

(124.3)   $t^{ij} = c^{ijkl}\,\dot e_{kl}$.

The coefficients $c^{ijkl}$ are the coefficients of viscosity, which depend on the
properties of a specific fluid under consideration. The linear law (124.3)
is quite analogous to the generalized Hooke’s law (119.14).

If the fluid is both homogeneous and isotropic, the number of
independent viscosity coefficients reduces to two, and the relation 124.3
assumes the form [cf. (121.1)]

(124.4)   $t^{ij} = \lambda\,g^{ij}\,v^k_{,k} + 2\mu\,\dot e^{ij}$,

where $\lambda$ and $\mu$ are constants and $v^k_{,k}$ is the divergence of the velocity field.

Accordingly, the complete stress tensor which includes the effects of
viscosity and hydrostatic pressure can be written in the form

(124.5)   $\tau_{ij} = -p\,g_{ij} + \lambda\,\dot\vartheta\,g_{ij} + 2\mu\,\dot e_{ij}$,

where $\dot\vartheta = v^i_{,i} = g^{ij}\dot e_{ij}$.

We recall next the equations of motion 116.8, and write them in the
covariant form

(124.6)   $g^{jk}\tau_{ij,k} = \rho(a_i - F_i)$,

and substitute in (124.6) for the $\tau_{ij}$ from (124.5). The result¹¹ is Navier’s
equations of fluid motion

(124.7)   $(\lambda + \mu)\,\dot\vartheta_{,i} + \mu\,g^{jk}v_{i,jk} - p_{,i} = \rho(a_i - F_i)$,

¹¹ Note that $a^i = \frac{\delta v^i}{\delta t} = \frac{\partial v^i}{\partial t} + v^i_{,j}v^j$, so that $a_i = \frac{\partial v_i}{\partial t} + v_{i,j}v^j$.


or, in vector form,

$(\lambda + \mu)\nabla\dot\vartheta + \mu\nabla^2\mathbf{v} - \nabla p = \rho(\mathbf{a} - \mathbf{F})$.

The set of three Navier’s equations 124.7 involves five unknowns: $v^i(x, t)$,
($i = 1, 2, 3$), $p(x, t)$, and $\rho(x, t)$. To complete the system, we adjoin (as in
the case of ideal fluids) the equation of state and the continuity equation
123.3. For incompressible fluids $d\rho/dt = 0$, and hence $v^i_{,i} = \dot\vartheta = 0$ by
(123.3). Accordingly, for incompressible fluids, equation 124.7 yields

(124.8)   $\mu\,g^{jk}v_{i,jk} - p_{,i} = \rho(a_i - F_i)$.

Furthermore if the fluid is ideal, $\mu = 0$ and equations 124.8 reduce to
Euler’s equations (123.2).

Stokes simplified equations 124.7 by introducing a hypothesis to the
effect that the mean pressure $p$ in a viscous fluid is given by the same formula
122.2 as in the case of fluids at rest. This assumption leads to the
conclusion that the constants $\lambda$ and $\mu$ are not independent. Indeed, from
(124.1)

$t_{ij} = \tau_{ij} + p\,g_{ij}$;

hence

$g^{ij}t_{ij} = g^{ij}\tau_{ij} + p\,g^{ij}g_{ij} = -3p + 3p = 0$,

if we use formula 122.2. But since $t_{ij}$ is given by (124.4), we have upon
multiplying those equations by $g^{ij}$,

$\lambda\,g^{ij}g_{ij}\,\dot\vartheta + 2\mu\,g^{ij}\dot e_{ij} = 0$,

or

$(3\lambda + 2\mu)\,\dot\vartheta = 0$.

Thus

(124.9)   $3\lambda + 2\mu = 0$.
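
The Stokes relation can be verified on a sample velocity gradient. The sketch below is an illustration in Cartesian coordinates, where $g_{ij} = \delta_{ij}$; the function names and the numerical values are assumptions chosen for the example. It builds the viscous stresses from the isotropic law (124.4) and shows that their trace vanishes when $\lambda = -\tfrac{2}{3}\mu$.

```python
# Illustrative sketch: viscous stresses t_ij = lam*(div v)*delta_ij + 2*mu*edot_ij
# in Cartesian coordinates, and the effect of the Stokes relation 3*lam + 2*mu = 0.

def viscous_stress(grad_v, lam, mu):
    """grad_v[i][j] holds dv_i/dx_j; edot_ij = (grad_v[i][j] + grad_v[j][i]) / 2."""
    div_v = grad_v[0][0] + grad_v[1][1] + grad_v[2][2]
    return [[lam * div_v * (1.0 if i == j else 0.0)
             + mu * (grad_v[i][j] + grad_v[j][i])
             for j in range(3)] for i in range(3)]


def trace(t):
    """Sum of the diagonal components."""
    return t[0][0] + t[1][1] + t[2][2]


def stokes_lambda(mu):
    """Value of lam required by the Stokes relation 3*lam + 2*mu = 0."""
    return -2.0 * mu / 3.0


if __name__ == "__main__":
    grad_v = [[0.2, 0.1, 0.0], [0.3, -0.5, 0.4], [0.0, 0.2, 0.7]]  # assumed sample
    mu = 1.8e-5
    print("trace with lam = 0:       ", trace(viscous_stress(grad_v, 0.0, mu)))
    print("trace with Stokes' lam:   ", trace(viscous_stress(grad_v, stokes_lambda(mu), mu)))
```

In general the trace equals $(3\lambda + 2\mu)$ times the divergence of the velocity, so it vanishes for every velocity field only when the Stokes relation holds.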


As a consequence of this relation, equations 124.7 depend only on one
viscosity coefficient $\mu$, and the substitution from (124.9) in (124.7) yields
the set of Navier-Stokes’s hydrodynamical equations

(124.10)   $\mu\,g^{jk}v_{i,jk} + \frac{\mu}{3}\,\frac{\partial\dot\vartheta}{\partial x^i} - \frac{\partial p}{\partial x^i} = \rho(a_i - F_i)$.

If the fluid is ideal we get, on setting $\mu = 0$ and

$a_i = \frac{\partial v_i}{\partial t} + v_{i,j}v^j$,

the Eulerian hydrodynamical equations

(124.11)   $\frac{\partial v_i}{\partial t} + v_{i,j}v^j = F_i - \frac{1}{\rho}\,\frac{\partial p}{\partial x^i}$,

for ideal compressible fluids.

If the motion is slow, the term $v_{i,j}v^j$ can be disregarded, and then
$a_i = \partial v_i/\partial t$.
Problems 


1. Show that the equation characterizing an incompressible fluid can be
written in the form

$\frac{1}{\sqrt{g}}\,\frac{\partial(\sqrt{g}\,v^i)}{\partial x^i} = 0$,   where $g = |g_{ij}|$.


2. Show that the Navier-Stokes equations can be written 


dvi pi 20" a av! ("| av! | 1) av’ 
— = y — — - — 


at and at t (ik) dx) T \yl Bak 


al me cc 0 ed a 
tA E mk (jf Um aI? | pF 808 
əvi il ae a (dv! Na ; 
Š PE e li’) + 


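where $\nu = \mu/\rho$ is the kinematic viscosity.

3. Show that the equation of continuity can be written

$\frac{\partial\rho}{\partial t} + \frac{\partial(\rho v^i)}{\partial x^i} + \rho\,v^i\,\frac{\partial\log\sqrt{g}}{\partial x^i} = 0$.

Hint: Use the expression for $v^i_{,i}$ in Problem 1.

4. Show that the equation of continuity in cylindrical coordinates
[$g_{11} = 1$, $g_{22} = (x^1)^2$, $g_{33} = 1$] is

$\frac{\partial\rho}{\partial t} + \frac{\partial(\rho v^i)}{\partial x^i} + \frac{\rho\,v^1}{x^1} = 0$

and in spherical polar coordinates [$g_{11} = 1$, $g_{22} = (x^1)^2$, $g_{33} = (x^1)^2\sin^2 x^2$] is

$\frac{\partial\rho}{\partial t} + \frac{\partial(\rho v^i)}{\partial x^i} + \rho\left(\frac{2v^1}{x^1} + v^2\cot x^2\right) = 0$.

5. The curl $\mathbf{v}$ of the velocity field $\mathbf{v}$ is equal to twice the angular velocity of
rotation. The vector $\boldsymbol{\omega}$ such that curl $\mathbf{v} = 2\boldsymbol{\omega}$ is called the vorticity vector.
Show that $\omega^i_{,i} = 0$. Hint: $\omega^i = -\tfrac{1}{2}\,\epsilon^{ijk}v_{j,k}$.

6. If the vorticity vector $\omega^i = 0$, the motion is called irrotational. Show that,
if the motion is irrotational, the velocity vector $\mathbf{v}$ is the gradient of the velocity
potential.

7. Obtain the approximate equations of motion of a viscous fluid when the
motion is slow.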




125. Remarks on Turbulent Flows and Dissipative Media 


We conclude our brief survey of the elements of mechanics of continua 
with a few remarks on turbulent flows of fluids and on construction of 
models for media, in which the processes are irreversible. 

Fluid flows in which the velocity components $v^i$ experience complicated
pulsating changes are called turbulent. In dealing with turbulent flows of
liquids and gases it is natural to represent the velocity components in the
form $v^i = \bar v^i + v'^i$, where $\bar v^i$ is the mean value of $v^i$ over a suitable period
of time and $v'^i$ is the pulsating component of $v^i$. Similar resolutions into
mean and pulsating components can be made for the pressure $p$ and density
$\rho$, so that $p = \bar p + p'$ and $\rho = \bar\rho + \rho'$. The development of the theory
of turbulent flow crucially depends on the character of averaging
processes used to compute $\bar v^i$, $\bar p$, and $\bar\rho$ and on the formulation of relations
among these average quantities.

If one assumes, for example, that the pulsating components $v^i$, $p$, and
$\rho$ are governed by the Navier-Stokes equations for an incompressible
fluid, then one averaging process applied to Navier’s equation leads to a
set of equations obtained by Reynolds.¹² These equations involve not
only the $\bar v^i$, but also the mean values of the pulsating components of
velocity. Because of the presence of these latter components, the system
of Reynolds equations is incomplete and new hypotheses, based on
experimental evidence, must be introduced to complete the system.
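
The reason averaged equations acquire extra unknowns can be seen on sampled data. The sketch below is an illustration only: it assumes a plain arithmetic mean over equally spaced samples as the averaging process (the text leaves the choice of averaging open), and the synthetic signals and function names are ours. The mean of each pulsating component vanishes, but the mean of a product of two pulsating components in general does not.

```python
# Illustrative sketch: resolution of sampled quantities into mean and pulsating
# parts, using an arithmetic mean over the samples as the ASSUMED averaging process.
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def reynolds_split(samples):
    """Return (mean value, list of pulsating components) for the samples."""
    average = mean(samples)
    return average, [s - average for s in samples]


def mean_product_of_pulsations(a, b):
    """Mean value of the product of the pulsating parts of a and b."""
    _, a_puls = reynolds_split(a)
    _, b_puls = reynolds_split(b)
    return mean([p * q for p, q in zip(a_puls, b_puls)])


if __name__ == "__main__":
    n = 64
    phase = [2.0 * math.pi * k / n for k in range(n)]     # one full period
    v1 = [1.0 + 0.1 * math.sin(t) for t in phase]          # synthetic signal
    v2 = [0.2 * math.sin(t) for t in phase]                # synthetic signal
    v1_mean, v1_puls = reynolds_split(v1)
    print("mean of v1:", v1_mean)
    print("mean of the pulsating part of v1:", mean(v1_puls))
    print("mean of the product of pulsations:", mean_product_of_pulsations(v1, v2))
```

Terms of the last kind survive the averaging of the nonlinear part of the acceleration, which is why further hypotheses are needed to close the averaged system.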

It appears unlikely that a unified formulation of satisfactory models for 
turbulent flows of compressible viscous fluids or for viscoelastic and plastic 
solids can be constructed within the framework of classical mechanics 
and thermodynamics. The development of such models is likely to be 
based on statistical mechanics in which mechanical characteristics are 
viewed as probabilities and their values appear as mathematical expecta- 
tions. 

A discussion of models of plastic and viscoelastic materials, utilizing 
the principles of thermodynamics of irreversible processes, is contained in 
a monograph by L. I. Sedov, cited in footnote 12 and in A. Cemal 
Eringen’s Non-linear Theory of Continuous Media (New York), 1962. 


¹² See, for example, H. Schlichting, Boundary Layer Theory, New York (1955),
Chapter XVIII, and L. I. Sedov, Introduction to Mechanics of a Continuous Medium, 
Moscow (1962), pp. 213-217. 


BIBLIOGRAPHY 


P. Appell, Traité de mécanique rationnelle, vol. 5 (Paris, 1926).

L. P. Eisenhart, Riemannian Geometry (Princeton, 1926). 

T. Levi-Civita, The Absolute Differential Calculus (London, 1927). 

A. S. Eddington, The Mathematical Theory of Relativity (Cambridge, 1930). 

A. J. McConnell, Applications of the Absolute Differential Calculus (London, 1931). 
O. Veblen, Invariants of Quadratic Differential Forms (Cambridge, 1933). 

T. Y. Thomas, Differential Invariants of Generalized Spaces (Cambridge, 1934). 

R. B. Lindsay and H. Margenau, Foundations of Physics (New York, 1936). 

L. Brillouin, Les tenseurs en mécanique et en élasticité (Paris, 1938). 

C. E. Weatherburn, Riemannian Geometry and the Tensor Calculus (Cambridge, 1938). 
L. P. Eisenhart, An Introduction to Differential Geometry (Princeton, 1940). 

P. G. Bergmann, An Introduction to the Theory of Relativity (New York, 1942). 

A. D. Michal, Matrix and Tensor Calculus (New York, 1947). 

J. L. Synge and A. Schild, Tensor Calculus (Toronto, 1949).

G. Y. Rainich, Mathematics of Relativity (New York, 1950).

D. J. Struik, Lectures on Classical Differential Geometry (Cambridge, Mass., 1950).

F. D. Murnaghan, Finite Deformation of an Elastic Solid (New York, 1951).

A. E. Green and W. Zerna, Theoretical Elasticity (Oxford, 1954).

I. S. Sokolnikoff, Mathematical Theory of Elasticity (New York, 1956).

J. L. Synge, Relativity: The Special Theory (Amsterdam, 1956).

J. L. Synge, Relativity: The General Theory (Amsterdam, 1960).

W. Prager, Introduction to Mechanics of Continua (Boston, 1961).

T. Y. Thomas, Concepts from Tensor Analysis and Differential Geometry (New York, 1961).

A. C. Eringen, Nonlinear Theory of Continuous Media (New York, 1962).

J. C. H. Gerretsen, Lectures on Tensor Calculus and Differential Geometry (Groningen, 1962).

L. I. Sedov, Introduction to Mechanics of a Continuous Medium (Moscow, 1962).


INDEX 


Absolute derivative, 127 

Absolute tensor, 71 

Acceleration, 207, 287, 350 

Action integral, 232 

Action, principle of least, 229 

Admissible functional arguments, 148 

Admissible transformations, 52 

Affine transformation, 10, 80 

Algebra of tensors, 64 

Angle, between coordinate lines, 118 
between directions in space, 117 
between directions on a surface, 144 

Anisotropic media, 340 

Appell, P., 79, 256, 286, 313, 353 

Arc length, along a curve in space, 130 
along a curve on a surface, 142 
along coordinate lines, 117 
element of, 72, 92, 106, 142, 203 

Area, element of, 146 

Associated tensors, 74 

Axiom, of dimensionality, 10 
of parallels, 105 

Axioms for linear vector spaces, 10


Beltrami, E., 106 
Bergmann, P. G., 300, 311, 312, 353 
Bernoulli, D., 230 
Bertrand, J. L. F., 136 
Bianchi’s identities, 91 
Binormal, 133 

Birkhoff, Garrett, 33 
Birkhoff, G. D., 300, 304 
Bliss, G. A., 245 

Bolyai, J., 106 

Bonnet, O., 202 

Bolza, O., 232 

Bouquet, J. C., 95 




Brachistochrone, 245 
Brillouin, L., 313, 353 


Calculus of variations, 147—156 
fundamental lemma in, 149 
fundamental problem of, 148 

Cantor, M., 1 

Carathéodory, C., 232 

Cartan, E., 95 

Cauchy, A. L., 330 

Cauchy-Schwarz inequality, 204 

Cayley, A., 112 

Characteristic values of matrices, 32, 

36 

Christoffel, E. B., 81 

Christoffel symbols, 79 
transformation of, 80 

Clemence, G. M., 311

Closure, property of, 54 

Codazzi equations, 185 

Collar, A. R., 252 

Compatibility, equations of, 326 

Components of tensors, 50, 60 
laws of transformation for, 58—62 

Components of vectors, 7, 13 
physical, 8, 121, 214

Conservation, of energy, 214, 228, 239, 

297 
of mass, 297, 345 

Conservative force fields, 212, 217 

Constraints, nonholonomic, 156, 242 

Continuity, equation of, 344, 346 

Contraction, in relativity, 288 
of tensors, 65 

Contravariant and covariant laws, 59, 

62 
tensor character of, 62 




Contravariant tensor, 61 
Contravariant vector, 60 
Coordinate curves (or lines), 113 
Coordinate surfaces, 113 
Coordinate systems, 1, 9 

construction of, 1 

oblique cartesian, 3 

orthogonal cartesian, 3, 12 
Coordinates, curvilinear, 112, 138 

cylindrical, 114 

Gaussian, 140 

generalized, 233 

geodesic, 162 

local, 292 

normal, 47, 252 

orthogonal, 118, 145 

proper, 292 

spherical, 52, 114 

transformation of, 10, 51, 140 
Correspondence, one-to-one, 1, 9 
Cosine of an angle, 203 
Covariant and contravariant laws, 59— 

62 

tensor character of, 62 
Covariant differentiation, 81—89 

inversion of order of, 88 
Covariant tensor, 58 
Covariant vector, 57 
Cramer’s rule, 18 
Curl of a vector, in cartesian coordi- 

nates, 266 

in curvilinear coordinates, 268 
Curvature, Einstein, 168 

Gaussian, 167, 186 

geodesic, 170, 188 

integral, 202 

lines of, 192 

mean, 186 

normal, of a surface, 189

of a curve, 131, 136 

radius of, 189 

total, 167, 186 
Curvature vector, 133 
Curvatures, principal, 191 
Curve, motion of particle on a, 215 
Curves, coordinate, 113 

in space, 130, 203 

on a surface, 187 

smooth, 216 




Curvilinear coordinates, in space, 112 
on a surface, 138 
Cycloidal pendulum, 218 


D’Alembert’s principle, 332 
Darboux, G., 95 
Dedekind, J. W. R., 1 
Deflection of light rays, 311 
Deformation, of space, 25, 314 
analysis of, 314-327 
Deltas, Kronecker, 13, 18, 98, 104 
Density, scalar, 70
Derivative, absolute, 127
covariant, 81, 84 
intrinsic, 127 
of a base vector, 126 
of a vector, 81, 124 
of an invariant, 81 
tensor, 177 
Descartes, R., 1 
Determinants, 17, 101 
differentiation of, 103 
expansion of, 18, 103 
multiplication of, 17, 102 
Vandermondian, 33 
Differentiation, covariant, 81 
intrinsic, 127 
tensor, 177 
Dilatation, 323 
Dimensionality of space, axiom for, 10 
Direction, in space, 116, 203 
on a surface, 143 
principal, 191 
Direction moment, 143 
Dirichlet’s problem, 274 
Displacement vector, 4, 207, 324 
Distance, Euclidean, 115, 203 
Distortion of volume elements, 322 
Divergence of a vector, in cartesian coordinates, 264
in curvilinear coordinates, 266
in cylindrical coordinates, 267 
in plane polar coordinates, 267 
in spherical coordinates, 267 
Divergence theorem, 264 
Duncan, W. J., 252 
Dupin’s theorem, 195 
Dynamics, of a particle, 207 


of n particles, 233
of rigid bodies, 233


e-systems, 97, 146 
application of, to determinants, 101 
ε-systems, 133, 146
derivatives of, 134, 180 
tensor character of, 133 
Eddington, A. S., 299, 311, 312, 353 
Eigenvalues and eigenvectors, 32 
Einstein, A., 59, 92, 288, 290, 296, 
298 
Einstein curvature, 168 
Einstein’s energy equation, 295 
Einstein’s gravitational equations, 298 
Einstein’s postulates, 289 
Einstein’s tensor, 92 
Eisenhart, L. P., 159, 166, 186, 198, 
200, 353 
Elastic constants, 342 
Elasticity, equations of, 338, 343 
Energy, 209, 336 
conservation of, 214, 228, 239, 297 
equation of, 214, 217, 228 
free, 339 
integral of, 228 
internal, 337 
kinetic, 211, 335 
potential, 212, 339 
Entropy, 337 
Equilibrium, differential equations of, 
330 
Eringen, A. C., 353 
Euclidean space, 4, 25, 72, 92, 108 
Euclid’s axiom of parallels, 105 
Euclid’s Elements, 105 
Euler, L., 152, 230 
Eulerian hydrodynamical equations, 346, 350
Euler’s equations, 152-156
Extremals of functionals, 150, 153
Extremum, constrained, 153, 242


Fermat’s principle, 229 
Fermi, E., 163 
Field, conservative, 212 
tensor, 62 
vector, 123 
Fitzgerald, G. F., 219 


Fluid, ideal, 346, 350
incompressible, 350
viscous, 345
Flux of a gravitational field, 268
Force, 208
Forces, external and internal, 240 
generalized, 238 
reactive, 240 
workless, 241 
Frazer, R. A., 252 
Free indices, 17 
Frenet formulas, 134—136 
Frequency equation, 252 
Functional, 148 
Functions, linear vector, 24 
of class C”, 51 
scalar point, 54 
Fundamental quadratic form, first, 140, 142
second, 180
Fundamental tensor, 74


Galilean transformations, 287 
Galileo, 207 
Gauss-Bonnet theorem, 198 
Gauss, equation of, 185 
formulas of, 182, 184 
Gauss, K. F., 106, 177, 263 
Gauss’ equations of a surface, 139 
Gauss’ flux theorem, 268 
Gaussian curvature, 167, 186 
Generalized coordinates, 234 
Generalized force, 238 
Generalized momentum, 258 
Generalized velocities, 234 
Generalized virtual displacements, 241 
Geodesic coordinates, 162 
Geodesic curvature, 170 
Geodesics, 157 
trajectories as, 233 
Geometrization of dynamics, 233 
Geometry, Lobachevskian, 111 
metric, 107 
non-Euclidean, 105 
Riemannian, 107 
Gerretsen, J. C. H., 204, 353 
Gravitation, Einstein’s law of, 298 
Newton’s law of, 259 
Green, A. E., 313, 353 


Green, G., 263
Green’s function, 275, 278
Green’s theorems, 264, 273
Griffith, B. A., 256, 286
Group, abstract, 54
Groups, isomorphic, 56

Hamilton, W. R., 208, 230
Hamiltonian function, 257
Hamilton’s equations, 256
Hamilton’s principle, 226
Harmonic function, 273
Helix, 137, 160
Hermitean matrices, 47
Holonomic systems, 156, 235, 242
Hooke’s law, 340
Huygens, C., 218
Hydrodynamics, equations of, 344-351
Hydrostatic pressure, 344

Ideal fluid, 346
Incompressible fluid, 350
Indices, free, 17
summation, 16
Inertial systems, 207
Infinitesimal strains, 326
Inner product of tensors, 66
Integrability conditions, 95, 183, 326
Interval, 292
Intrinsic differentiation, 126
Intrinsic geometry, 138
Invariance, concept of, 50
of physical laws, 287
transformation by, 54
Invariants, 51
Irrotational motion, 351
Isometric surfaces, 164
Isotropic media, 340

Jacobi, C. G. J., 230, 232
Jacobian determinants, 53

Kellogg, O. D., 263, 264
Kepler’s law, 259, 285
Kinetic energy, 211, 335
Kinetic potential, 238
Klein, F., 112
Kronecker deltas, 13, 19, 98
derivatives of, 104
tensor character of, 101

Lagrange, J. L., 208, 230
Lagrangean equations of motion, 212, 235, 242
Lagrangean function, 213
Lamé’s constants, 342
Landau, L., 312
Laplace’s equation, 89
Laplacian, 265
in cartesian coordinates, 89
in curvilinear coordinates, 89
in cylindrical coordinates, 267
in plane polar coordinates, 267
in spherical coordinates, 267
Length, element of, 73, 96, 107, 203
of a vector, 11, 203
Levi-Civita, T., 163, 300, 353
Lifshitz, E., 312
Light, velocity of, 288, 289
Light rays, deflection of, 311
Lindsay, R. B., 209, 353
Line, straight, 137
Line-element, in space, 73, 96, 107, 203
on a surface, 142
Linear dependence, 6, 15
of vectors, 6, 15
Linear transformations, 19, 28
Linear vector spaces, complex, 14
real, 10
Lobachevskian geometry, 111
Lobachevsky, N., 105
Local coordinates, 292
Lorentz, H. A., 288, 289, 291
Lorentz-Einstein transformation, 290
Lorentz-Fitzgerald contraction, 288, 290
Love, A. E. H., 344

MacLane, S., 33
Manifold, 9
n-dimensional, 10, 202
non-Euclidean, 203
Riemannian, 92
Margenau, H., 209, 353
Mass, conservation of, 297, 345
gravitational, 209
inertial, 209
rest or proper, 294


Mass-energy relationship, 297 
Matrices, 20 
algebra of, 20-24 
characteristic equation of, 36 
characteristic values of, 32, 36 
diagonal, 21 
Hermitean, 47 
inverse, 23 
orthogonal, 28 
real symmetric, 34 
reduction to diagonal form, 30 
similar, 29 
singular, 22 
unitary, 47 
Maupertuis, P. M. L., 229 
McConnell, A. J., 127, 178, 179, 180, 
183, 186, 313, 353 
McVittie, G. C., 311
Mean curvature, 186 
Measure numbers of a vector, 7 
Mechanics of a particle, 206 
Metric space, 9, 107 
Metric tensor, 72, 142 
Meusnier’s theorem, 187 
Michal, A. D., 313, 353 
Minimum principles, 229 
Minkowski’s acceleration, 293 
Minkowski’s velocity, 293 
Moment of force, 260 
Momentum, 208 
Motion, equations of, for a continuous 
medium, 332 
irrotational, 351 
Motion of a particle on a curve, 215 
Motion of a particle on a surface, 219 
Murnaghan, F. D., 33, 313, 327, 342, 
353 


Natural system, 234 
Natural trajectory, 216 
Navier equations, 342
of fluid motion, 349
Navier-Stokes’ hydrodynamical equations, 350
Neumann’s problem, 275
Newton, I., 207, 259, 281
Newtonian law of gravitation, 259
Newtonian laws, 207
Nirenberg, L., 205
Non-holonomic systems, 235, 242
Normal coordinates, 47, 252 
Normal curvature, of a surface, 189
principal, 190
Normal line to a surface, 175
Normal modes of vibration, 47, 254
Normal vector, to a curve, 132
to a surface, 175
to a surface curve, 170
Novozhilov, V. V., 313 


Orthogonal curvilinear coordinates, condition for, 118
Orthogonal transformations, 27, 29
Orthogonality of vectors, 11, 145
Ortho-normal systems of vectors, 8, 11, 12
Osculating plane, 131


Parabolic points, 194 
Parallel postulate, 105 
Parallel vector fields, along a curve, 
128 
along a surface curve, 163 
Parallel surfaces, 195 
Parallelogram law of addition, 4 
Pars, L. A., 245 
Particles, dynamics of, 207, 233 
relativistic dynamics of, 298 
Pendulum, cycloidal, 218 
Pendulum, double, 250 
simple, 219, 249 
spherical, 223, 256 
Perihelion, advance of, 308 
Perihelion constant, 286 
Perihelion of Mercury, 307, 310 
Physical components of a vector, 8, 
121, 214 
Planetary orbits, 304 
Pogorelov, A. V., 200, 205 
Poincaré, H., 112, 289
Poisson’s equation, 263, 271 
Poisson’s integral, 281 
Poisson’s ratio, 342 
Potential, elastic, 339 
gravitational, 262 
kinetic, 238 
velocity, 351 
Potential energy, 212 
Prager, W., 314, 353
Primary inertial system, 207 
Principal curvatures of a surface, 191 
Principal directions of strain, 320 
Principal directions of stress, 329 
Principal directions on a surface, 191 
Principal strains, 320
Principal stress, 329
Principle of least action, 231
Problem of two bodies, 281
Proper mass, 294
Pythagoras, formula of, 3, 13


Quadratic forms, 34
characteristic values of, 32, 37, 48
classification and properties of, 44
index of, 44
rank of, 44
Quadric of Cauchy, strain, 319
stress, 330
Quotient laws of tensors, 66


Rainich, G. Y., 299, 310, 312, 353
Rank of a tensor, 61
Rapidity, 292
Reciprocal base systems, 119
Redheffer, R. M., 262
Regression, edge of, 195
Relative scalar, 70
Relative tensors, 69, 103
Relativistic dynamics, 298
Relativity, general theory of, 298-304
restricted theory of, 288-274
Reynold’s equations, 352
Ricci, G., 59
Ricci tensor, 91
Ricci’s identity, 89
Ricci’s theorem, 86
Rice, J., 290
Riemann-Christoffel tensor, 86, 88
properties of, 89
Riemannian geometry, 107
Riemannian space, 92, 107
Riemann’s dissertation, 106
Riz, P., 332


Saint Venant, B., 326 
Savile, H., 105 
Scalar, 54
Scalar density, 70
Scalar product, 5, 13, 117 
triple, 121 
Schild, A., 353 
Schlichting, H., 352 
Schwarzschild, K., 300 
Schwarzschild’s line element, 301 
Sedov, L. I., 314, 327, 343, 352, 353 
Serret-Frenet formulas, 139 
Seugling, W. R., 327 
Shearing strains, 318 
Signorini, A., 313 
Similar transformations, 26 
Skew-symmetric systems, 97 
Skew-symmetric tensors, 69 
Small oscillations, 253 
Sokolnikoff, I. S., 51, 262, 263, 326, 
340, 344, 353 
Space, dimensionality of, 6, 9 
Euclidean, 4, 92, 202 
metric, 10, 107 
Riemannian, 92, 202 
Space curves, geometry of, 130 
Space-time manifold, 289 
Spaces, complex linear vector, 14 
Euclidean, 4, 92, 202 
linear vector, 6 
Spectral lines, shift of, 311 
Spherical excess, 201 
Spherical points, 194 
Spherically symmetric static field, 300 
State, equation of, 347 
Stevinus, S., 4 
Stokes, G. G., 263 
Stokes’ theorem, 266 
Straight line, equation of, 137 
Strain, in cartesian coordinates, 326 
infinitesimal, 326 
interpretation of, 317 
principal directions of, 320 
velocity, 349 
Strain invariants, 320 
Strain quadric, 319 
Strain tensor, 316, 326 
Stress, analysis of, 327-332
principal, 329 
types of, 330 
Stress invariants, 330 
Stress quadric, 330 
Stress-strain relation, 339, 340, 341
Stress tensor, 328
symmetry of, 331
Stress vector, 328
Struik, D. J., 200, 202, 353
Summation convention, 16
Surface, curves on, 187
element of, 146
equations of, 139
intrinsic geometry of, 140
particle on a, 219
Surfaces, isometric, 165
parallel, 195
tangent, 195
topologically equivalent, 202
Symmetric systems, 97
Symmetric tensors, 69
Synge, J. L., 256, 286, 290, 296, 300,
304, 310, 310, 312, 353


Tangent surfaces, 195
Tensor derivatives, 177
Tensor equations, 64
Tensor fields, 62
Tensors, absolute, 71
algebra of, 64
Tensors, associated, 74
calculus of, 81-86
components of, 50, 60
contraction of, 65
contravariant, 61
covariant, 58
covariant differentiation of, 81-86
fundamental, 74, 181
intrinsic differentiation of, 127
meine, o
ee 6l
quotient laws for, 66
rank of, 61
relative, 71
Riemann-Christoffel, 86
symmetric and skew-symmetric, 69
tensor differentiation of, 177
types of, 59
Thermodynamic laws, 336
Thermoelastic equations, 340
Thomas, T. Y., 95, 198, 353
Tolman, R., 290, 312
Torsion, 133, 137
Total curvature, 167, 186
Trajectories as geodesics, 232
Trajectory, of a dynamical system, 238
of a particle, 210, 216
Transformation theorems, 263
Transformations, admissible, 52
affine, 10
Galilean, 287
induced, 58
of rotation, 28
orthogonal, 28
similar, 26
unitary, 47
Truesdell, C., 314
Turbulent flow, 352


Umbilical points, 194
Unitary transformations, 47


Vandermondian determinant, 33
Variation, symbol of, 224
Variation, of strain tensor, 334
Veblen, O., 9, 59, 353
Vector spaces, n-dimensional, linear, 10
Velocity of a particle, 207
Velocity strains, 349
Virtual displacement, 241, 332
Virtual work, 240
Viscosity, coefficients of, 349
kinematic, 351
Viscous fluid, 345, 349
Voigt, W., 289
Volume, element of, 118, 204
Vorticity vector, 351


Weatherburn, C. E., 353
Weight of a tensor, 71
Weierstrass, K., 147
Weingarten’s formulas, 185
Weyl, H., 59, 300
Whittaker, E. T., 223
Work, definition of, 210
function, 212
virtual, 241, 332, 334


Young’s modulus, 342


Zerna, W., 313, 353
Zvolinsky, N. V., 318


about the author... 


I. S. SOKOLNIKOFF is Professor
of Mathematics at the University of 
California, Los Angeles, a position 
he has held since 1946. He received 
his B.S. from Idaho University and 
his Ph.D. from the University of Wis- 
consin where he taught from 1927 
to 1946. 

He worked with the National De-
fense Research Committee and in
the Office of Scientific Research and 
Development in a number of posi- 
tions from 1941 to 1946. The recipi- 
ent of two Guggenheim fellowships, 
Dr. Sokolnikoff has also conducted 
research under three grants from 
the Research Corporation of New 
York. He is a member of the Ameri- 
can Mathematical Society. 

The author of numerous journal 
articles in the mathematical theory 
of elasticity, he has also written a 
number of books on this as well as 
other topics. He is the editor of the
Quarterly of Applied Mathematics. 


Advanced Engineering 
Mathematics 


By ERWIN KREYSZIG, The Ohio State University. “The
book is well written, and both the exposition and the exten- 
sive sets of problems reflect a balance between mathe- 
matical theory and emphasis on applications. The book is 
a suitable textbook for a three- or a four-semester course 
or, by omitting certain sections, it can be used for separate 
one-semester courses in several areas. It is an excellent 
reference book for engineers and furnishes a handy guide 
to the more commonly used mathematical theory, with 
references to more detailed treatments for those that are
interested.” —Leon W. Rutland in Science. 


Introduction to Vector 
and Tensor Analysis 


By ROBERT C. WREDE, San Jose State Co 
book is a careful presentation of the fund 
used in developing geometry, analysis, and I 
It is centered around four pivotal ideas: historic 
tive and motivation; the interrelationships of 


possible the section on special relativity, in 
geometric structure of the theory. 


John Wiley & Sons 


New York · London


